HEPiX CASPUR, Rome Meeting Site Reports

HEPiX CASPUR Meeting Report - Site Reports

Site Reports presented at the Meeting held October 23rd - 25th, 1996

  1. Introduction

  2. Site Reports

  3. OCS by Lauri Loebel Carpenter/FNAL
    OCS stands for Operator Communications Software and it is a UNIX tool to provide tape volume handling across all platforms. FNAL uses the package for tape drive allocation and operator-assisted tape mounts; it also provides logging and statistics. Eleven different project teams use it, plus a number of outside sites.

    The interface is by means of library calls, shell commands or via a MOTIF interface; the last of these is used exclusively by the operators because it has many features designed specifically for them, especially in the area of problem handling.

    A system administrator can define access rights to particular drives by group; a user can specify criteria for selecting drives, whether or not to queue a request, read-only or read-write access, etc. The speaker showed a number of examples of OCS commands, followed by the structure of the various programs and the command message flows.

    Plans for the next version, version 3, include layering OCS on top of FTT (see later talk) and, beyond that, adding robot support and integrating the currently-separate OCS databases.

  4. FMSS by Lisa Giacchetti/FNAL
    The Fermi Mass Storage System is an interface to a mass storage system based today on an IBM 3494 robot and used with HPSS software (FNAL is a member of the HPSS Consortium). It consists of a set of tools to centrally manage user data files. It offers transparent access to data without the user having to know where the data resides. It also has special provisions for archiving small files and handling large files in a "bulk area". Group quotas are supported.

    Nodes must be registered to use FMSS and it is most commonly found on central systems. Use can also be restricted to members of a given user table.

    The speaker showed a list of the various FMSS calls such as copy, remove, query, etc. The authors had maintained traditional UNIX formats for the commands where possible.

    The product was not at that time yet in production status; people were still migrating off the STK robot and Unitree scheme. It was hoped to complete beta testing and begin production use within 6 months.

    Future versions should permit direct import/export to tape. More staging options would be added.

  5. Status of the CERN Magnetic Tape Robot Acquisition by Harry Renshall
    A group of CERN users and internal and external tape specialists combined to define CERN's future needs for tape automation and a market survey based on this was issued late in 1995. As a result of the replies, it was decided to wait for a time, partly in order for more experience to be gained in emerging technologies and partly to allow CERN to bring into full production use its then-current robotics. It was clear that much more information should be learned from other sites, and there were many outside contacts as well as onsite tests.

    The objective has been defined as follows: no more manual tape mounts outside CERN working hours by the end of 1997, and a fully automatic scheme by the time LHC begins. Users should be encouraged to make more use of robotics and in the meantime much effort goes into tuning the tape library so that the busiest tapes are placed in the robots. There is also close cooperation with robotics vendors to improve their hardware and software.

    As a result of all these actions, the automatic mount rate does show signs of increasing at the expense of manual mounts and this has permitted the desired reduction of overnight operational support, thus promoting savings for investment in the next stage of automation.

    Renshall presented some details of the units under consideration, including performance figures. Tests performed at CERN in general supported the vendors' claims. All of them gave satisfactory results and the differences were in the area of purchase and/or operational costs. For the acquisition exercise currently underway, specific criteria were established such as mounts per hour, the need to demonstrate working sites with the equipment to be offered, the cost of ownership over several years, and so on.

    At the time of the meeting, a Call for Tender had been issued and it was hoped to take the result for adjudication to the December Finance Committee meeting and, if confirmed, to have the equipment installed for the 1997 LEP/SPS run.

  6. ZARAH Tape File System by O. Manczak/DESY
    Zarah is the data storage scheme for the ZEUS experiment at DESY. The data is write-once, read-many and the aim is eventually to store 30TB. The ZEUS computing centre includes 2 SGI Challenge XL systems, tape robotics with 12TB capacity today and 1.2TB of disc space. 200 client workstations access this from on- and off-site and there are more than 250 users per week. Zarah provides a user interface to the hierarchical storage and offers overall a high performance, large capacity file service. The interface hides the file staging process from the user as well as offering distributed access and scalability.

    Zarah is implemented via a fake NFS daemon using RPC between the clients and the servers. On the server side, there is a stub file system which is identical in structure to the real file system on the mass storage, and a user's file tree structure is thus mapped onto it.
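
    The mapping idea can be illustrated with a short sketch (a conceptual illustration only, not the actual ZARAH code; the directory layout and the stage step are assumptions):

      from pathlib import Path

      STUB_ROOT = Path("zarah-stub")    # assumed stub tree mirroring the mass-storage hierarchy
      CACHE_ROOT = Path("zarah-cache")  # assumed disc staging area

      def resolve(user_path: str) -> Path:
          """Map a user-visible path in the stub tree to a staged copy on disc.

          If the file is not yet cached, a stage request would be sent to the
          mass-storage system; here that step is only simulated."""
          rel = Path(user_path).relative_to(STUB_ROOT)
          cached = CACHE_ROOT / rel
          if not cached.exists():
              cached.parent.mkdir(parents=True, exist_ok=True)
              print(f"staging {rel} from mass storage into {cached}")  # simulated stage
              cached.touch()
          return cached

      if __name__ == "__main__":
          print(resolve("zarah-stub/zeus/run1996/events.dat"))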

    The speaker explained in detail the mechanism of staging a file. He stated that NFS was very slow and they used instead the RFIO protocol which offered other advantages over NFS as well as better speeds. They used a local implementation of RFIO and he explained how this affected and improved access to files. Speeds achieved had reached 40MB/second from an SGI server via HiPPI.

  7. FTT by Lauri Loebel Carpenter/FNAL
    FTT, Fermi Tape Tools, is a consolidation of various system-dependent tape I/O codes. The package offers support for various operating systems and tape types and the group responsible for it are always ready to add modules for more platforms or tapes if requested.

    FTT provides the basis for other data access packages such as OCS (see earlier), where FTT gives the basic tape access routines. It contains modules of library calls, a setuid binary and some test code. Use of the package provides system- and device-independence, statistics gathering and consistent error returns. It is easy to port to new platforms but there is a need to further simplify the procedures for building the required tables of tape characteristics and operating system features and to automate the necessary checking procedures. Another open point is a mechanism to determine which drives are available on a system, made more difficult since each architecture has its own commands. Work is also needed to improve the Porting Guide and it is then planned to include FTT in the Fermi Tools package for public distribution.

  8. A Roundtable Discussion on Tape Interchange Issues led by John Gordon/RAL
    The speaker first summarised the conclusions reached at the TRIUMF meeting of HEPiX (see URL http://wwwcn.cern.ch/hepix/meetings/triumf/minutes.html). The proposal to standardise on SL labels was already being adopted more and more, for example at IN2P3, RAL and DESY/H1; and the maximum file size of 1GB had already been adopted, for example by the ATLAS LHC experiment. However, in Vancouver, the meeting had been unable to reach consensus on a recommended transport medium and J.Gordon wished to return to this discussion; had an extra 6 months of experience with various media helped us come to a conclusion?

    DLT 2000 units are still very popular, can this be the default? Or should we wait for DLT 4000 to become more widespread? Are the various DLT models (7000, 4000, 2000) at least downwards compatible in both read (claimed) and write (less certain)?

    H.Renshall stated the CERN position - they are user-driven and try to guarantee support for (almost) any reasonable media requested by CERN users. However, he agreed with the suggestion that if there was a HEP-wide recommendation, it might help to reduce diversity.

    It was reported by one delegate that Quantum had stopped manufacture of DLT 2000 models.

    After a long discussion, it was agreed to declare a recommended default for any site that sought advice, while of course allowing any site that wished to support additional types to do so. On this basis the default should be DLT 2000 media format, although any new drive should be a DLT 4000 or DLT 7000 (by this time the meeting was convinced that both these units could write DLT 2000 formats). This recommendation would not only provide the lowest common denominator for data interchange between the labs but also offer small sites the possibility to use the same drive (e.g. the DLT 4000) with sufficient local capacity for file backup.

  9. DANTE: Data Analysis and MC production cluster for Aleph by Silvia Arezzini and Maurizio Davini/INFN Pisa
    DANTE is a cluster made up of 7 HP 735 nodes and an HP D Series 350 twin-CPU server. There is also an HP Magneto-optical robot and a DLT robot. DANTE is a member of the Pisa AFS cell (all users have AFS home directories) and in fact it hosts the Pisa AFS cell. The tape cartridge robots run under the control of HP Openview Omnistorage. This provides a form of HSM (hierarchical storage management), it supports various magnetic media types and makes the storage areas appear as a single large file system. It is controlled via a Motif GUI.

    The cluster is used for ALEPH Monte Carlo and data analysis work. The standard ALEPH software modules are mirrored nightly from CERN and the results of the jobs on DANTE can be stored locally or written back via the network to SHIFT at CERN. One node of the cluster is used for interactive work and the others for NQS batch queues, the CERN flavour of NQS with the gettoken scheme for AFS tokens. The HEPiX login and X scripts are in use and there is an ASIS mirror.

    Among the problems seen have been occasional Ethernet congestion, for which a network upgrade using HP's AnyLAN is planned, and a major problem with NFS, for which a patch was found. In the discussion, it was noted from the audience that NFS under the latest HP-UX releases showed significant performance improvements.

  10. Parallel Processing at CERN by Eric McIntosh/CERN
    The speaker first gave a review of the concept of parallel programming - there is clearly an opportunity to use many processors in parallel; the question is how to split the job and then recombine the results. We must also be careful about the communications overhead.

    Parallelism at CERN began with the use of the late Seymour Cray's CDC 6600 and its various functional units. From there we moved to vector processors, to RISC computers, MPP systems, the GP-MIMD2 (a European Commission-funded project) and currently to the Meiko CS-2. This also is EU-funded with 3 industrial partners and 3 user sites including CERN. Meiko also sold two large systems to Lawrence Livermore Laboratory, which had the unfortunate side-effect of diverting vital support personnel away from our project, although this sale did assist in the debugging of the overall system. Recently, Quadrics has taken over Meiko, and its best experts, and more developments of the architecture are planned. So far, Quadrics have given excellent support and Eric also acknowledged here the support and help of Bernd Panzer of CERN in the work around the CS-2.

    The CS-2 is a SPARC-based machine with a proprietary internal network rated up to 50 MBps. It has a vector option but this is very expensive and only really useful with long vectors (not our situation). The initial SPARC chips used suffered poor floating point performance and recently Ross 100 MHz HyperSPARC chips have been installed. Memory is physically distributed but logically shared among the 64 nodes of 2 processors each. Each node also has a disc for local swapping. For data space, there is some 100GB of internal disc and 400GB externally. The Meiko parallel file system uses striping but the user sees only improved performance as more nodes are brought into use to access the discs.

    The systems are connected to the CERN network by FDDI but HiPPI will be tried with NA48. The operating system is Solaris 2.3 with extensions from Meiko, including an attractive system management tool permitting control of remote nodes. As set up, four nodes are designated control nodes for the whole system. Users' home directories are local, NFS-exported to all nodes in the system. Eight nodes are dedicated to PIAF, 5 for interactive logins and other nodes are used for batch or parallel tasks. AFS access is via the NFS exporter.

    There are three programming models available:

    Eric gave some examples of applications in operation

  11. The CERNVM Migration Experience by Harry Renshall/CERN
    When it was decided in December 1993 to close the CERNVM system, a special Task Force was established to concentrate and motivate the effort needed to effect this major change. At its peak, this task force involved some 30 people. The largest task was to build up a UNIX environment for physics and engineering users but it also required a specific marketing exercise to really get the message to users in a simple and timely fashion. Key technical areas included mail, editing and migration tools. In each case, various alternatives were studied and recommendations were made. Solutions independent of a particular UNIX flavour were given priority. The final list of recommended utilities was published (URL http://consult.cern.ch/umtf/1996/tools4certification). On the marketing side, a series of user seminars and open user meetings were held and much of the material presented is freely available on the web at URL http://wwwcn1.cern.ch/umtf/. There were also lots of articles in the CERN Computer Newsletter and extra copies of this were printed and distributed.

    After a slow start, the so-called User Migration Task Force (UMTF) was re-organised in October 1994 after the HEPiX meeting in Saclay/DAPNIA to include more input from other labs, notably DESY. The result was CUTE (Common UNIX and X Terminal Environment, see for example HEPiX Rio at URL http://wwwcn.cern.ch/hepix/meetings/rio/Rio_cute.ps). The first incarnation of CUTE was the interactive service opened on CERNSP, CERN's SP2.

    Most major batch work on CERNVM was stopped at the end of 1995 and transferred to CORE (see for example HEPiX Fermilab at URL http://dcdsv0.fnal.gov:8000/hepix/1094/talks/batch/shift.ps). At that time, the CERNVM configuration was halved in size. That left an interactive user base of some 11,000 accounts.

    The planned stop of CERNVM was originally end-96 but this was advanced by 6 months to end-June '96. Cleanup of the unused interactive accounts started early in 1996 and this removed some one-quarter of the remaining accounts. Gradually the CERNVM load was reduced until it was formally closed at the end of June. Migration of the VMarchive database (300 GB, of which some 105 GB was "active"; this was compressed to 35 GB) and users' minidisks (50 GB) to the RS/6000-based ADSM service was completed by the end of August. Tools were written to permit users to retrieve their own files from these archives but such occurrences have proved very rare indeed. Also by this time the client side of the ADSM backup service had been changed on user nodes to point to the RS/6000 version of this service.

    To cope with the expected problems, the number of staff on duty in the Computer User Consultancy Office (UCO) was doubled throughout June and July and there was a great deal of individual "hand-holding" of users to help with their migration. During the last month of use, some 2500 users were mailed individually to encourage them to migrate either to UNIX/CUTE or to PC/NICE.

    After the official shutdown at midnight on June 30th, the system was run throughout July with no users except for 2 particular applications. After this, and once all the archive and minidisc files had been archived on ADSM, the system was finally shut down, except for a brief restart in mid-August to retrieve via ADSM a disc for a UNIX client.

    Harry drew only a few lessons from the process, the main one being the amount of work needed to help users migrate and feel happy with their new environment. He was concerned that this might not always be possible in the future as support services were gradually scaled down. Also he stated that users often felt happier with fewer choices being presented to them, especially in the UNIX environment where there are always several ways of tackling any problem.

  12. Fermilab Farms by Lisa Giacchetti/FNAL
    At the time of the meeting, there were some 158 SGI nodes and 144 IBM systems in the Fermilab central processing farm, a total of about 8000 MIPS. All nodes have local disc, often very small capacity, and often minimal memory. There are also 10 larger nodes designated as I/O nodes with some 250 GB of disc space and some 90 Exabyte drives. Worker nodes are split by experiment into different Ethernet segments with one I/O node and one file system per segment and appropriate disc and tape allocation; great use is made of NFS for disc file access although bulk data moves are via TCP. This arrangement makes it rather cumbersome to effect changes in configuration, for example when an experiment's node allocation changes.

    Parallel processing is typically effected by CPS, a locally-produced socket-based scheme which distributes events to worker nodes. The base CPS package has been extended to provide a batch queuing scheme.
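
    The general idea of distributing events over sockets can be sketched as follows; this is a generic illustration, not the real CPS, and the host, port and chunk size are invented:

      import socket
      import threading
      import time

      HOST, PORT = "127.0.0.1", 5055          # illustrative values only
      TOTAL_EVENTS, CHUNK = 1000, 250

      def master():
          """Hand out ranges of event numbers to whichever worker asks next."""
          next_event = 0
          lock = threading.Lock()
          server = socket.create_server((HOST, PORT))
          def serve(conn):
              nonlocal next_event
              with conn:
                  while conn.recv(64):                       # worker asks for work
                      with lock:
                          first = next_event
                          next_event = min(next_event + CHUNK, TOTAL_EVENTS)
                      conn.sendall(f"{first} {next_event}\n".encode())
                      if first >= TOTAL_EVENTS:
                          break
          while True:
              conn, _ = server.accept()
              threading.Thread(target=serve, args=(conn,), daemon=True).start()

      def worker():
          """Repeatedly request a range of events and 'process' it."""
          with socket.create_connection((HOST, PORT)) as s:
              while True:
                  s.sendall(b"next")
                  first, last = map(int, s.recv(64).split())
                  if first >= last:
                      break
                  print(f"processing events {first}..{last - 1}")

      if __name__ == "__main__":
          threading.Thread(target=master, daemon=True).start()
          time.sleep(0.5)                      # give the master time to start listening
          worker()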

    Problems with the above arrangement include:

    For the future there is known to be a need for much more capacity, both data and CPU, better inter-node communication links and more memory for physics analysis jobs. To this end, Fermilab issued a tender to the suppliers of the 4 platforms already present on site and as a result they are currently purchasing a further 45 SGI Challenge S nodes, 22 IBM 45P nodes, 2 more I/O nodes, one each from SGI and IBM, plus more disc and tape capacity.

    On the new farm, the configuration will be modified to use an Ethernet switch to permit a much more flexible allocation of worker nodes to experiments, including node sharing where appropriate. The new farm will have a single file system, possibly AFS, and this should eliminate the need for bulk file or data moves. All the equipment had arrived by the time of the meeting and the target was to have the new farm in production by January 1997.

    Speculating on possible futures, Lisa mentioned the suggestion of PC/Linux farms, multi-CPU systems, perhaps a Windows-NT farm. For the moment however, Fermilab intends to stay with Exabyte drives, largely for compatibility with the running experiments.

  13. Fixed Target Experiment Support under UNIX by Lauri Loebel Carpenter/FNAL
    The challenge facing the FNAL support group was how to support many UNIX systems coming from different sources and staffed by teams of different skill levels. A prime requirement was obviously the adoption of standards, especially in the area of operating systems and patches. It was also felt that centralised support would be most efficient, including call logging and support mailing lists, and that plenty of documentation should be made available.

    A script was written to be run locally on target systems to gather information about their configuration; this information is then posted on a web page for each system. Another useful tool was found to be a Suggestions Document where one can easily find a list of Do's and Don't's.
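
    A minimal sketch of such a configuration-gathering step is shown below, assuming Python is available on the target systems; it is not FNAL's actual script, and the fields collected are just examples:

      import getpass
      import json
      import platform
      import socket
      import sys

      def gather() -> dict:
          """Collect a few basic facts about the node, in the spirit of the
          configuration-gathering script described above (illustrative only)."""
          return {
              "hostname": socket.gethostname(),
              "os": platform.system(),
              "os_release": platform.release(),
              "architecture": platform.machine(),
              "python": sys.version.split()[0],
              "collected_by": getpass.getuser(),
          }

      if __name__ == "__main__":
          # The real tool posted results to a per-system web page; here we just
          # print a record that such a page could be generated from.
          print(json.dumps(gather(), indent=2))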

    So far, there has been a good level of acceptance by the user community although concerns have been expressed about too-frequent SGI patch releases. Another currently-open question is system level backup: when should it happen, how often, and what to do if there is no local tape drive.

    Among the plans of the support group is the idea of creating a more automated method of installing the operating system on client nodes.

  14. CERN WGS and PLUS Services, an Update by Tony Cass/CERN
    This was a review of the Public Login UNIX Servers and dedicated UNIX Work Group Servers relative to what had been presented at previous HEPiX meetings, see for example the talk at HEPiX Rio on WGS, URL http://wwwcn.cern.ch/hepix/meetings/rio/Rio_cute.ps.

    The systems were installed with the appropriate certified version of the operating system (see HEPiX Vancouver , URL http://wwwcn.cern.ch/hepix/meetings/triumf/cern-certify.ps), then SUE tailoring (see for example HEPiX Prague, URL http://wwwcn.cern.ch/hepix/meetings/prague/CERN_sue_update.ps), then the WGS/PLUS customisation scripts and then tailored for each cluster according to use. Of course, AFS was vital for all of this. Several tools had been developed for performance monitoring, to identify trends, predict problems and so on. Users had requested some remote monitoring which would then include any networking effects. This will be added shortly.

    For the future ...

    Other topics under more or less active consideration are CDE, DFS and NT.

  15. FNALU, Revising the Vision by Lisa Giacchetti/FNAL
    FNALU is the multi-purpose central UNIX cluster at Fermilab. In its original version it was targeted at physics applications as well as some CPU-intensive batch work. It is AFS-based and multi-platform. Over time, the AFS servers themselves have migrated from AIX systems to SunOS, currently on 6 SUNs with 600 GB of disc space. Use of AFS has permitted a physical move of the cluster to take place without interruption of the user service. Overall, usage of FNALU has risen gradually, especially with the rundown of the FNAL VMS service.

    On the negative side of AFS, its performance was found to be poor for large files and this has led to the installation of 2 times 60 GB of local, non-AFS, non-backed-up disc space on the cluster nodes, dedicated to certain applications. Also, AFS proves to be less than optimal on multi-CPU systems, for example 20 node machines, giving system cache problems.

    There is now a move to merge CLUBS (the heavily I/O batch-oriented cluster) with FNALU in order to reduce the central support load and to provide a unique user interface. One implication would be a migration from Loadleveler to LSF. This would unify the user interface to services and offer a single queuing scheme, attractive to both users and system administrators.

  16. Interactive Benchmarks by Claudia Bondila/CERN
    An attempt has been made to model and benchmark a typical CERN PLUS (see above) interactive load. It is hoped that this would help in fine-tuning nodes of the PLUS and WGS clusters and in predicting trends for future investment. Today's standard CERN benchmarks are batch only. Claudia has developed a Monte Carlo scheme to generate simulated UNIX accounting data and has attempted to model a diverse applications mix. She uses accounting records because these are easier to generate and handle, although it is realised that they only provide information on the end-state of a job with no details of the resources used over the life of the job. Also there is no information on network (including AFS) use and we cannot distinguish between shared and "normal" memory.

    Claudia explained the methodology she used: accounting data produces a huge amount of information and this needs to be compressed by splitting it into groups and categories. Users are grouped into "clusters", typifying their use of the systems, and different combinations of these clusters are used to produce different benchmarks. By varying the resources used by a member of each cluster type and the total number of users, a Monte Carlo method can be used to generate system loading and hence statistics. She presented some of the results in graphical form.
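
    A toy version of this Monte Carlo generation step might look as follows; the cluster profiles, distributions and numbers are invented for illustration and are not Claudia's actual categories or results:

      import random

      # Illustrative "cluster" profiles: mean CPU seconds and mean number of
      # commands per session for each type of user.  The real figures and
      # categories were derived from CERN accounting data, not these numbers.
      CLUSTERS = {
          "editor":   {"cpu_mean": 2.0,   "commands_mean": 40},
          "compiler": {"cpu_mean": 30.0,  "commands_mean": 15},
          "analysis": {"cpu_mean": 300.0, "commands_mean": 5},
      }

      def simulate(n_users: int, mix: dict, seed: int = 1) -> dict:
          """Draw a synthetic interactive load for n_users, given a mix of
          cluster types (fractions summing to 1), and return aggregate totals."""
          rng = random.Random(seed)
          total_cpu = total_commands = 0.0
          names, weights = zip(*mix.items())
          for _ in range(n_users):
              profile = CLUSTERS[rng.choices(names, weights)[0]]
              # Exponential draws give a long-tailed, accounting-like distribution.
              total_cpu += rng.expovariate(1.0 / profile["cpu_mean"])
              total_commands += rng.expovariate(1.0 / profile["commands_mean"])
          return {"users": n_users, "cpu_seconds": round(total_cpu, 1),
                  "commands": int(total_commands)}

      if __name__ == "__main__":
          print(simulate(50, {"editor": 0.6, "compiler": 0.3, "analysis": 0.1}))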

    So far, she had modelled the ATLAS WGS load and she will now try this for other services, refining the command definitions and comparing different platforms.

  17. Interactive Services at CCIN2P3 by W.Wojcik/CCIN2P3
    This talk was concerned only with the SIOUX and BAHIA clusters at CCIN2P3 although they share file and tape services with the batch clusters. BAHIA consists of 4 HP J200 series nodes and 4 RS/6000 model 390s in the CCIN2P3 SP2. It is intended for batch job preparation and general interactive work. They investigated response times by running test scripts and they found that HP model 735s become overloaded with 15 users logged in, but moving to J200 systems has raised this limit to at least 50 users without degradation.

    SIOUX has 2 HP 735s and 4 RS/6000s, again from the SP2 configuration, and is intended for users with heavier CPU tasks. In fact BAHIA uses nodes of SIOUX for large CPU jobs. Like BAHIA it uses load sharing but based on a simple round-robin sharing algorithm where the user gives a generic name (BAHIA or SIOUX) and the name server allocates a particular node name to the connection request.
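
    A minimal sketch of such a round-robin name allocation is given below; the node names are illustrative, and the real service presumably sits behind the site name server rather than a Python dictionary:

      import itertools

      # Hypothetical node lists; the real generic names BAHIA and SIOUX map to
      # the HP and RS/6000 nodes described above.
      POOLS = {
          "BAHIA": ["bahia1", "bahia2", "bahia3", "bahia4"],
          "SIOUX": ["sioux1", "sioux2"],
      }
      _cycles = {name: itertools.cycle(nodes) for name, nodes in POOLS.items()}

      def allocate(generic_name: str) -> str:
          """Return the next node for a generic cluster name, round-robin."""
          return next(_cycles[generic_name])

      if __name__ == "__main__":
          for _ in range(5):
              print("BAHIA ->", allocate("BAHIA"))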

    PIAF has been ported to run on the SP2 using IBM's parallel file system for data access. They needed to develop a group quota scheme to use this parallel file system. It is in use by 6 user groups. The conclusion so far is that the PIAF/SP2 arrangement is well suited to data mining tasks and that the parallel file system offers an interesting speed-up in a completely transparent way.

  18. Roundtable Discussion on Security Around HEP Sites led by L.Cons/CERN
    This was a discussion on computer security around the different HEP sites led by Lionel Cons, convenor of the HEPiX Security Working Group which was established after the HEPiX Prague meeting. Since then, however, there has been little activity. The session started with some site reports.

    From the above it was clear that things are indeed changing for the better with regard to security, with much more awareness now of the risks and possibilities, at least in the larger labs. AFS in particular makes security that much more important: for example, how should sensitive files normally stored in the home directory be protected - should they be partially open to HEP sites or completely protected?

    During the discussion it was proposed that each site appoint a named security contact whose name could be included on a mailing list. The question was raised whether HEPiX should, or even could, define a recommendation for a default security policy and, if it did, how that could be implemented. Could one achieve a secure HEP collaboration over the Internet?

    Other subjects raised included

  19. Managing a local copy of the ASIS repository by Philippe Defert/CERN
    For people who do not already know, ASIS is CERN's central repository of public domain UNIX utilities and CERN-provided tools and libraries. Programs and packages are "cared for" by a local product "maintainer" according to specific levels of support ranging from full support to acting merely as a channel to the original developer. There are 600 products/packages contained in some 14GB of disc space and 10 architectures are represented although some such as ULTRIX and SunOS will be frozen at the end of 1996. Clients can access ASIS via AFS, NFS or DFS (although external DFS access was not yet available).

    Products "appear" to clients as if they sit in /usr/local/... although in fact many are actually links and there is a regular update procedure (ASISwsm) which normally runs nightly on client nodes to update these links. This procedure is tailorable by a local system administrator. Another utility, ASISsetup, declares which version of which products should be used on a particular node, similar in function to FNAL's UPS/UPD procedures. However, enabling access to multiple versions at a time requires extra effort from the maintainer and this is not possible for all packages.

    Product maintainers have a certain number of tools at their disposal: to build a new version across all supported architectures with one command; for testing; and for public release of new versions. Products pass through various states - under test, under certification, release - and Philippe illustrated the flow of modules and the directory hierarchy used.

    To offer ASIS at remote sites, a replication scheme has been developed which is usually run daily. It checks for the presence on the CERN master repository of new products or new versions of products, thus minimising file transfers. This extends to checking whether the change is only a state change (for example, moving from certified to release state), in which case the files on the remote site do not need to be updated, only their state and/or directory placement. The ASIS local copy manager tool was currently being updated to take account of the latest ASIS structure changes. At the time of the meeting some 6 remote sites were mirroring ASIS locally.
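
    The key decision in the replication scheme - transfer files only when the version has changed, otherwise just relabel - can be sketched like this (the catalogue entries and field names are invented, not the real ASIS data model):

      # Compare the master catalogue with the local one and transfer files only
      # when the version changed, not when only the state changed.
      master = {
          "cernlib": {"version": "96b", "state": "release"},
          "pine":    {"version": "3.95", "state": "certified"},
      }
      local = {
          "cernlib": {"version": "96a", "state": "release"},
          "pine":    {"version": "3.95", "state": "test"},
      }

      def plan_sync(master: dict, local: dict) -> list[str]:
          actions = []
          for name, entry in master.items():
              mine = local.get(name)
              if mine is None or mine["version"] != entry["version"]:
                  actions.append(f"transfer {name} {entry['version']}")
              elif mine["state"] != entry["state"]:
                  actions.append(f"relabel {name} -> {entry['state']}")  # no file transfer
          return actions

      if __name__ == "__main__":
          print("\n".join(plan_sync(master, local)))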

    Product state changes and releases are defined as "transactions" such that if a transaction does not complete fully (removal of an old version for example, followed by release of a new one), then some recovery is attempted and warning messages issued.

  20. ELLIOT: An Archiving System for UNIX by Herve Hue/CCIN2P3
    CCIN2P3 felt the need for a UNIX file archiving scheme to replace VMarchive on their VM system, to be ready by mid-97 when the mainframe is due to be stopped. It should have both a command line and a graphical user interface and be based on ADSM, the agreed file backup utility. ADSM in its "natural" state was rejected as the archive tool after some tests because it stored too little information about each file, offered poor authorisation control, and because the command line and GUI actions stored different items! There was also a suspicion that ADSM, at least the then-current version, suffered from a Year 2000 problem - it could not accept an expiry date beyond 2000.

    The chosen solution was to add a client part "above" ADSM using ORACLE to store the required extra meta-data about the files and to build a MOTIF interface to this. XDesigner was used to build this interface. The client code on each node communicates with an ELLIOT server daemon which itself communicates with ADSM and ORACLE. The ELLIOT server must reside on a node with access to the files it is to archive, normally the file server itself.

    ELLIOT offers powerful query facilities and, if only meta-data is requested, then only ORACLE queries are issued; ADSM is called only if the file itself must be accessed. Further, the query calls are contained in a library such that eventually another database could be substituted, or even another backup product to replace ADSM.
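
    The split between metadata queries and file retrieval can be illustrated conceptually as below; the two classes merely stand in for ORACLE and ADSM, and their interfaces are invented for illustration, not ELLIOT's actual API:

      class MetadataStore:                 # stands in for the ORACLE tables
          def __init__(self):
              self.records = {}
          def add(self, path, owner, size):
              self.records[path] = {"owner": owner, "size": size}
          def query(self, path):
              return self.records.get(path)

      class BulkStore:                     # stands in for ADSM
          def retrieve(self, path):
              return f"<contents of {path} restored from tape>"

      class ArchiveService:
          def __init__(self):
              self.meta, self.bulk = MetadataStore(), BulkStore()
          def describe(self, path):
              return self.meta.query(path)           # metadata only: no bulk access
          def restore(self, path):
              if self.meta.query(path) is None:
                  raise KeyError(path)
              return self.bulk.retrieve(path)        # file content: bulk store involved

      if __name__ == "__main__":
          svc = ArchiveService()
          svc.meta.add("/home/alice/thesis.tex", owner="alice", size=120_000)
          print(svc.describe("/home/alice/thesis.tex"))
          print(svc.restore("/home/alice/thesis.tex"))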

    The speaker showed examples of both the graphical and line mode interface to ELLIOT. Man pages and online help from both the GUI and line mode versions already exist. The program was then in beta-test. In answer to a question, Herve stated that the files could be fully accessible even without the ORACLE meta-data since sufficient archive data was stored in ADSM as well.

  21. The DESY UNIX User Registry System by S. Koulikov/DESY
    There are many clusters at DESY, AFS and non-AFS, UNIX and non-UNIX (e.g. Novell). A scheme has been developed to have a single user registry for at least the UNIX clusters based on DCE registry and QDDB, a public domain package from the University of Kentucky. Both databases contain almost the same information, DCE being the master, and nightly consistency checks are performed. QDDB is normally used for quick information searches.
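
    A minimal sketch of such a nightly consistency check is shown below; the record layout is invented, with the DCE registry treated as the master:

      # Compare user entries in the master registry with the copy and report
      # any differences found.
      dce_registry = {
          "alice": {"uid": 2001, "shell": "/bin/ksh"},
          "bob":   {"uid": 2002, "shell": "/bin/csh"},
      }
      qddb_copy = {
          "alice": {"uid": 2001, "shell": "/bin/tcsh"},   # drifted field
          "carol": {"uid": 2003, "shell": "/bin/ksh"},    # missing from master
      }

      def consistency_report(master: dict, copy: dict) -> list[str]:
          report = []
          for user, rec in master.items():
              if user not in copy:
                  report.append(f"{user}: missing from copy")
              elif copy[user] != rec:
                  report.append(f"{user}: fields differ {copy[user]} != {rec}")
          report += [f"{user}: not in master" for user in copy if user not in master]
          return report

      if __name__ == "__main__":
          print("\n".join(consistency_report(dce_registry, qddb_copy)))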

    The normal registry administrators and the DESY User Consultancy Office can register and modify users' information and group administrators can perform certain maintenance tasks, for example modify a password or space quota. The user can modify his/her own password, choice of shell, mail address and default printer.

    The speaker described the 2 databases in some detail and how the information flows when a registry entry is made or altered. He also showed the graphical interface.

    This is seen as a first step in the migration to DCE, where the current plan is for every user to have a DCE account from which accounts would be generated on particular clusters or services.

  22. A Scalable Mail Service - LAL Experience with IMAP by M.Jouvin/LAL
    For some time LAL has been largely based on VMS mail, accessed via dumb terminals or a Motif interface. PC and MAC users were told to use terminal emulators to the VAXes and mail from there. UNIX mail was not supported. As the use of VMS has decreased and that of UNIX has risen, a good UNIX solution was needed, including a native GUI for PC and MAC users and support for PPP access from home. Other requirements included

    X.400 was considered and rejected - largely because the address format has no fit with the sendmail protocol, because X.400 is difficult to manage and because there are too few clients.

    POP was rejected, despite its nice user agents, because

    LAL finally selected IMAP 4 (based on RFC 1730) as the basic mail protocol. Messages are stored on a server and accessed by clients. The so-called IMAP consortium drives the standard and its extension ICAP (the Internet Calendar Access Protocol). For IMAP, more and more clients and servers are becoming available, both public domain and commercial. Among the promised servers are Netscape mail server, Microsoft Exchange (both due end '96) and SUN Solstice.

    IMAP supports MIME, it has an efficient header-only search scheme and it is optimised for low speed links. Many mailbox management schemes are possible including ACLs and quotas. The companion protocol (IMSP) will evolve into ACAP (Application Configuration Access Protocol), which is expected to be ready at the end of 1996. This will help with the configuration of address books and should permit common and similar access for address books as for mailboxes. ACAP should also extend IMAP to multiple mail servers with load sharing and replication.
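
    As an illustration of the header-only access that makes IMAP attractive here, the following sketch uses Python's standard imaplib module; the server name, account and mailbox are placeholders:

      import imaplib

      # Connection details are placeholders; any IMAP4 server should do.
      HOST, USER, PASSWORD = "imap.example.org", "alice", "secret"

      def list_headers(mailbox: str = "INBOX", limit: int = 5) -> None:
          """Fetch only the headers of the most recent messages, leaving the
          (possibly large) bodies on the server."""
          conn = imaplib.IMAP4_SSL(HOST)
          try:
              conn.login(USER, PASSWORD)
              conn.select(mailbox, readonly=True)
              _, data = conn.search(None, "ALL")
              for num in data[0].split()[-limit:]:
                  _, msg = conn.fetch(num, "(BODY.PEEK[HEADER.FIELDS (FROM SUBJECT DATE)])")
                  print(msg[0][1].decode(errors="replace"))
          finally:
              conn.logout()

      if __name__ == "__main__":
          list_headers()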

    The CMU/Cyrus project suite (also known as the Andrew II Project) is effectively IMAP+IMSP. LAL runs this on a pair of high-availability Alphaservers with automatic failover. The Cyrus suite supports Kerberos login authentication and has a TCL-based management tool.

    As clients, they offer one or more on every platform, even on VMS, PC and MAC, although on the PC the choice is a commercial one (Netscape 4). At that time, only one tool was available on all platforms, Simeon (from a small Canadian company). This happens to be the only one today supporting ACAP, although Pine is expected to do so shortly and others later.

    The talk closed with a review of the features of Pine and Simeon and a list of URL references.

  23. X11 problems: how the HEPiX X11 scripts can help you by Lionel Cons/CERN
    Lionel was convinced that in general debugging X11 is hard: he illustrated this by showing examples of the rich variety of client/server combinations, the many inter-linked libraries used by the applications, some of the many environment variables, keyboard mappings, etc. Then there is a window manager, a session manager, and so on. Often even performing a "simple" task can be non-trivial - changing the cursor size or selecting a particular font for one application were just two examples.

    X11 is resource-hungry in CPU, memory and network traffic, especially if it is badly configured or contains "unfriendly" X11 applications (Xeyes is a trivial example, or a pretty pattern screen saver if not run on the X server itself). Lastly, X11 has several inherent security concerns.

    From all of this we can say that X11 needs

    This was the driving force for the development of the HEPiX X11 scripts. The project started as a joint effort by DESY, CERN and RAL with most of the actual development being done at CERN. They are now widely deployed at CERN and gradually elsewhere (e.g. DESY, SLAC).

    For the future, Lionel and his X11 team are looking at

  24. HEPiX X11 Working Group Report by Lionel Cons/CERN
    The meeting, held the day before this HEPiX conference, had consisted of various site reports, technical reviews of the current state and open problems, and the planning of future work. A list of proposed enhancements was agreed as well as a list of topics to be investigated in the future by the working group. The minutes from this meeting can be found on the web at URL http://wwwcn.cern.ch/hepix/wg/X11/www/Welcome.html.

  25. HEPiX and NT, a Roundtable Discussion led by John Gordon/RAL
    John introduced the subject by describing RAL's Windows plans - nothing on Windows 95, go directly to NT; it is expected over time to become the desktop system of choice. At that moment work was just beginning on an NT infrastructure at RAL where, although the UNIX base was rising, the interest in NT was rising even faster. The HEP processing model indicated a potential wide use of PC/NT in Monte Carlo production where the price as well as the price/performance could possibly solve the need for massive CPU power for LHC experiments at a price they could afford.

    He posed the question - should these PCs run NT, or Linux or Solaris - and does it matter? HEP needs good Fortran, C and C++ and these are all already there with more coming. For NT there are several options for NFS access (including Samba), AFS and DFS are announced, there are several NQS ports and a beta of LSF. RAL is porting their tape software (VTP) and Sysreq to NT. And remote access to NT is possible with telnet, WinDD, Wincenter, Ntrigue, etc.

    WOMBAT was a trial NT farm for HEP. It consisted of 5 Pentium Pro systems with a local file server, a connection to the ATLAS file store and Ntrigue for remote access. It was used to demonstrate several ports and provide a platform for experiments to try, including benchmarking. It offered an alternative to UNIX for groups now on VMS, which was being run down. It was hoped to have the first results by the end of 1996.

    Areas which John felt need to be addressed include

    RAL would be interested in involving other labs, perhaps a HEP-wide effort? They believed that portability was more important than uniformity; and sharing experiences, and mistakes, was very important.

    CERN had an approved project (RD47) evaluating the use of PC/NT. The LHC experiments were interested in NT and there were a certain number of individual initiatives happening around the lab.

    DESY has established an NT Interest Group including representatives of the Computer Centre, major user groups, and Zeuthen. The largest effort was that of the ZEUS experiment which plans to base itself on NT. They have ordered 20 Pentium Pro systems. On the other hand, there are no current plans for an NT-based desktop.

    CASPUR is heavily into PCs, but rather under Linux, as has been reported at this conference and elsewhere. Windows is only used for secretarial work. CASPUR will investigate the NT/AFS client (as will CERN).

    FNAL, in the process of migrating away from VMS, is recommending that administrative workers consider PC and MAC rather than UNIX, which is the recommended platform for physicists. Some NT servers are beginning to appear on the site.

    NIKHEF have bought some PCs for evaluation. LAL have one NT server, dedicated to a Wincenter service and GSI have several of these. Prague Uni have one PC for work on Objectivity.

    The session closed with a discussion on how to keep up-to-date with this area and the suggestions included

    and various people undertook to investigate these.

Alan Silverman, 27 February 1997