1997.11.24-25 RCF meeting draft memo
RCF meeting, November 24-25, 1997: draft memo
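This is a draft of brief notes on the RCF Meeting held at BNL on
November 24 and 25, 1997. I have not yet received the presentation
materials, so please excuse any inaccuracies or errors.
Ichihara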
Reference: http://www.rarf.riken.go.jp/rarf/rhic/rhic-cc-j/link.html
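Overview of the RCF meeting
Preparation plans for the Mock Data Challenge scheduled for next August
are taking concrete shape. In the opening progress reports, STAR reported
that Objectivity has been linked to STAF and that a demo version ran.
PHOBOS is considering a Root flat file + simple HPSS system as a fallback
that is certain to work on Day 1.
(There was no progress report from PHENIX.)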
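RCF reported on an evaluation of disk and tape access on a prototype for
the MDS (Managed Data Server), but the prototype is so underpowered that
a full evaluation has not yet been possible. In addition, Gigabit Ethernet
was tested on a SUN E3000 (dual UltraSPARC CPUs at 250 MHz), and CPU Free
reportedly dropped to 0% at 400 Mbps (both CPUs were fully consumed by
I/O).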
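A concrete proposal for the MDS for the Mock Data Challenge was presented.
Because HPSS (High Performance Storage System) currently exists only in an
IBM version, the proposed configuration has several IBM servers as HPSS
servers, a SUN SPARC server as the SMP, and Intel machines as the CPU
farm, all interconnected through a switch. (HPSS apparently costs $300K in
the first year when purchased in the US, with annual maintenance of $150K
from the second year on; the price in Japan corresponds to a conversion at
roughly $1 = 200 yen.) For the tape archiver, StorageTek's RedWood and
Sony's DTF are the two candidates under consideration, and a decision is
expected by about February next year. (The advantage of Sony DTF is that
the drive write speed is expected to reach 24 MB/s around 1999; the
disadvantages include the fact that Sony DTF is not yet a tape device
supported by HPSS.)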
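The OS for the Intel (Pentium) machines planned for the farm (Solaris
2.5.1 is currently used) was discussed, comparing Solaris, Linux, and NT.
Some recommended Linux, which is widely used in HEP/NP. On the other hand,
Objectivity DB is planned to be used on a per-event basis, and on Intel
Pentium Objectivity currently runs only on Windows NT. At least for next
summer's Mock Data Challenge, using Windows NT on Intel Pentium is
therefore being considered, and a study of how to migrate from UNIX to NT
has started and was reported. (So far the report is only a
documentation-based comparison of three porting tools: 1. NuTCRACKER
[DataFocus], 2. OpenNT [Softway Systems], 3. U/Win [1.35]; each is to be
evaluated on real machines from now on.)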
(The day after the RCF meeting, a first performance comparison of
Objectivity DB and ROOT DB was reported on the software ML, and the result
appears to be that Objectivity DB performs roughly 3 to 10 times worse
than ROOT DB. See ftp://root.cern.ch/root/oobench.ps.gz. Objectivity's
performance is a concern.)
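At the Requirement Task Force held after the meeting on the afternoon of
the 26th, the amount of CPU (SPECfp95) required by PHENIX was revised, and
the simulation-related figures increased substantially. (I will summarize
this separately later.)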
Notes on the individual presentations follow [draft, unedited]
----------------------------------------------------------------------
[Introduction (15 min) B. Gibbard]
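o Chuck Price is leaving RCF.
o The Requirement Task Force was set up 2-3 months ago.
The CPU and I/O of the CAS (Central Analysis Server) will probably span
a wide range.
o Mock Data Challenge Project
Mock Data Challenge Project I is scheduled for August 1998.
Mock Data Challenge Project II is scheduled for January/February 1999.
o HPSS (High Performance Storage System)
HPSS is the key to handling large volumes of data at RHIC.
HPSS is an IBM product and currently runs only on IBM platforms.
License fee: $300K in the first year; maintenance from the second year
on: $150K per year (expensive).
The Data Mover part was ported to SUN at SLAC.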
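As a rough illustration of the HPSS licensing figures quoted above, the
cumulative cost over several years can be tallied as below. This is only a
sketch: the function name and the sample year counts are assumptions made
for illustration; only the $300K first-year fee and the $150K annual
maintenance come from the notes.

```python
def hpss_cumulative_cost_kusd(years, first_year_kusd=300, maintenance_kusd=150):
    """Cumulative HPSS cost in thousands of US dollars after `years` years.

    The first year carries the license fee; every later year carries only
    the maintenance fee. The default figures are the ones quoted in the memo.
    """
    if years < 1:
        return 0
    return first_year_kusd + maintenance_kusd * (years - 1)


# Sample horizons (chosen for illustration, not taken from the meeting).
for years in (1, 2, 5):
    print(years, hpss_cumulative_cost_kusd(years))
```

Under these figures the maintenance fees exceed the initial license fee
once the system has been kept for four years or more.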
[STAR (20 min) - T. Wenaus]
o The STAR Computing Review was held on October 20-22, 1997.
o STAR Mock Data Challenge
Full integration of the off-line software will be completed shortly.
A Geant4 workshop was held in October 1997.
Reconstruction: the DST design group will deliver the MDS DST design
in December.
Objectivity was ported to STAF, and a demo version ran.
The final release of STAF97a is planned for around the end of this week.
SWAT Team : Scope of task force
STAF development, facilities and platforms are as follows:
Sun/Solaris, Intel/Solaris, IRIS
[STAR/NERSC Computing (20 min) - C. Tull(voice phone) ]
o High Energy Nuclear Physics computing
o HPSS and Objectivity were linked at SLAC.
o STAR event generation was demonstrated at Supercomputing 97.
[PHENIX/Japan Computing (20 min) - Y. Watanabe]
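o RIKEN plans to install a supercomputer next fiscal year, and
information is currently being gathered. If possible we would like to use
it for simulation. Its scale is about 0.5 Tflops, 10 TB of disk, and a
200 TB archiver.
o APAN is under way, and good results have been received from NSF.
[Supplement by Ichihara]
o If all goes well, a 45 Mbps international link between Japan and the US
will open early next year as the APAN backbone.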
[RCF Recent and Planned Purchases (15 min) - T. Throwe]
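An SGI Origin 200 (2 CPUs) was delivered on November 4.
It has been configured except for the following:
- IRIX 4.x version of CERN Library
- sufficient swap space
(There is no external SCSI port; either buy an internal disk or buy a
PCI/SCSI card.)
RAID Array
100 GB RAID5 in preparation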
[MDS Performance Measurements (15 min) - B. Healy]
Test report on the MDS (Managed Data Server) prototype
Test environment
SUN Ultra Enterprise 3000 server (dual 168MHz Ultra SPARC CPU)
StorageTek 9710 robot + 2 DLT drives
Test results (using the Jlabs perl script)
Disk
write 18 MB/sec RAM to disk
read 9 MB/sec disk to RAM
Tape
write 3.28 MB/sec RAID to tape
read 4.24 MB/sec tape to RAID
Combined test results
3x disk write 29.64 MB/sec
3x disk read 20.32 MB/sec
Conclusions
The current prototype is underpowered.
It struggles under concurrent operation.
Disk performance drops as the disk approaches full.
write to empty disk 26 MB/s
write to full disk 18 MB/s
Performance values scatter by about +-1 MB/s to 3 MB/s.
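To put the measured rates above in perspective, the short sketch below
computes the relative slowdown between writing to an empty and a full
disk, and the time needed to stream a given volume to tape at the measured
rate. The helper names are hypothetical, and the 100 GB volume is only an
illustrative choice (it matches the size of the RAID array mentioned
earlier); the rates themselves are the ones reported in the talk.

```python
def relative_drop(empty_rate, full_rate):
    """Fractional loss of throughput between an empty and a full disk."""
    return (empty_rate - full_rate) / empty_rate


def hours_to_transfer(gigabytes, rate_mb_per_s):
    """Hours needed to move `gigabytes` (1 GB = 1000 MB) at a steady rate."""
    seconds = gigabytes * 1000 / rate_mb_per_s
    return seconds / 3600


# Reported write rates: 26 MB/s to an empty disk, 18 MB/s to a full disk.
print(f"disk write slowdown: {relative_drop(26, 18):.1%}")

# Reported RAID-to-tape write rate: 3.28 MB/s. The 100 GB volume is assumed.
print(f"100 GB to tape: {hours_to_transfer(100, 3.28):.1f} h")
```

Given the quoted scatter of 1 to 3 MB/s in the measurements, such numbers
should be read as order-of-magnitude estimates only.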
[GBit Performance Measurements (15 min) - T. Healy]
1. Gigabit Ethernet test (1000Base-SX, multimode fiber)
Purpose: measure the CPU utilization caused by Gigabit Ethernet I/O on a
high-end server
Test environment: dual 248 MHz UltraSPARC / Solaris 2.6 (SUN E3000)
Results:
As the Ethernet data transfer rate increased, free CPU decreased linearly.
At 400 Mbps, CPU Free = 0% (all CPU was consumed by I/O processing).
2. Gigabit long-distance test
1000Base-SX vs 1000Base-LX (single mode)
multimode to single mode media converters used
testing between RCF and 10005-2500m
max 10005 at experiment (PHOBOS 2812m)
using SUN E3000 w/ Alteon switch, NICs
3. Other activities
switch evaluation+ Prominent
2 chassis, each with 5 Gigabit ports and 20 Fast Ethernet ports
test for trunking between switches, management, Alteon
compatibility
Gigabit for NT
Gigabit for AIX when available - perhaps in time for mock data
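The linear trend reported in test 1 can be written as a one-line model.
The sketch below assumes, beyond what was reported, that the CPU is 100%
free at zero traffic and that the relation stays exactly linear up to the
measured saturation point of 400 Mbps; the function name is hypothetical.

```python
def cpu_free_percent(rate_mbps, saturation_mbps=400.0):
    """Estimated free CPU (%) at a given Ethernet transfer rate.

    Linear model: 100% free at 0 Mbps (an assumption), 0% free at the
    saturation rate reported in the talk, clamped outside that range.
    """
    used = min(max(rate_mbps, 0.0), saturation_mbps) / saturation_mbps
    return 100.0 * (1.0 - used)


for rate in (0, 100, 200, 400, 1000):
    print(rate, cpu_free_percent(rate))
```

In this model the dual-CPU server would saturate well below the nominal
1000 Mbps of the link, which is the concern raised in the talk.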
[Status of GCE & CRS PPro Cluster (20 min) - J. Flanagan]
User disk moved to new file servers
new SGI origin 200 added to chump
2 x R10000 CPUs, 256MB RAM, IRIX 64
The farm currently has 35 Pentium Pro CPUs.
DQS 3.1.8 released, bugs fixed
DQS configuration too simple
Current OS on the Pentiums: Solaris
Solaris 2.5.1 network load problem (fixed in Solaris 2.6)
AFS does not work on Solaris 2.6.
(AFS on Intel/Solaris 2.5.1 is user-contributed software, and the
student who ported it has graduated from MIT, so it is unclear when AFS
will work on Solaris 2.6/Intel.)
[PHOBOS (20 min) - M. Baker]
o Phobos analysis chain test: succeeded on 29 Oct. 1997
Geant 3 -> Hit arrays -> Raw Hit Buffer -> Fake Raw Data file
-> Raw Hit buff -> Hit analysis -> -> (succeeded)
o Mock Data Challenge
Multiprocessor SMP test, Linux farm test
Critical
day 1 reliability & availability
full functionality of existing object model
Desirable
database functionality
Unimportant
long term language flexibility
D+++, Espresso (half cafe decafe with twist)
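(He seems to be arguing that long-term language flexibility, a selling
point of STAF on the STAF web home page, is not that important, and that
the reliability and completeness of the software on Day 1 matter more.)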
PHOBOS Fallback plan
use ROOT persistency mechanism
use flat files & simplest HPSS interface
- advantage
day 1 reliability & availability
full functionality of existing object model
complete database functionality available
- disadvantage
incomplete database antabe
Wedded to C++ (perhaps Java will also work)
Critical Studies
ROOT flat files fallback
How does this interface or interfere with HPSS?
ROOT + Objectivity
- Does the elegant scheme work (ccOBJ inheritance)?
we have written from Root to Objy inelegantly
We haven't tested reading yet
- Who maintains RCF/Objectivity?
it stopped working for a while and then magically OK
- Can we get it at MIT?
"Other" recon, data needs were de-scoped.
- realistic luminosity now assumed
PHOBOS DST size
- Dominated by a core histogram
-necessary for full E-by-E functional analysis
Conclusion
Analysis chain test was successful
Objectivity OK as a goal.
fall back is essential
prefer not to bend over too far backward
We like Linux, NT worth testing
[PHOBOS/MIT LNS Computing (20 min) - M. Baker]
Nov97 SCF configuration
Alpha server 8400, 12 CPU EV5/ 440 MHz
Currently 150 SPECint95
upgrade to 240 SPECint95 scheduled for Jan. 98
clock speed upgrade
Linux farm to be added
spring/summer 1998
comparable in size to the SMP
Conclusions
MIT LNS SCF established
A Linux farm will be added
[BRAHMS (20 min) - F. Videbaek]
[The text on the OHP slides was very small and completely unreadable.]
activities
RCF daily experience
DQS needs improvement in the coming version
AFS WinNT (locally)
interactive machines sluggish
[Nile (20 min) - R. Popescu]
Description
Hundreds of workstations and disks are merged into a wide-area virtual
computing system, able to process requests transparently at the most
appropriate locations, recover from failures, and exploit parallelism
NILE Object Repository
Store all the persistent objects related to computation to be done by NILE.
Data format defined for NILE and legacy drivers for backward compatibility
Tertiary Storage Manager: Tape and optical
Object Location service, logical -> physical
Event Selection Engine 2x64 bit -> index
Competition
DQS, NQS, PVM, CPS
Weaknesses
no fault-tolerance
no job management
no parallelism
Strength
solid product with proven performance
Status and schedule
Nile Fast-Track working for CLEO
Electra/Isas
core-state
6 Alpha Digital UNIX + 100Base-T (Nov97)
Test of the new release, Electra/Isas
[MDSsim (20 min) - D. Stampf]
Purpose of MDSsim
Simulate hardware performance
right now, simulate some hardware, into
December goal is to do this with the existing system
Progress
[MDS Issues & Model (30 min) - C. Price]
Managed Data Server
Tape Storage : SONY or STK
Software: HPSS and Objectivity
CPU cycles: SMP in MDS
StorageTek (STK):
Inexpensive robotics
Inexpensive media
Expensive drives
SONY
Hot technology (DTF)
Inexpensive drives
Expensive media
Expensive robotics
Not HPSS supported
Software
HPSS
currently only runs on IBM AIX
Port of "Mover Code" to SUN
Objectivity
likely recommendation of EFT
HPSS
CPU cycles in the MDS
Can we provide cost effective CPU cycles in the MDS ?
What is the impact on the CAS and network access to MDS data ?
Use SUN Enterprise server for analysis.
IBM has no appropriate SMP systems
IBM vs SUN
HPSS favors IBM
must for HPSS server
Disk drives
Network attached is ideal for HPSS design
- few supported devices
- expensive
- defeats the MDS SMP design
RAID vs JBOD
- performance (tests favor RAID, but by a fairly small margin)
- reliability (favors RAID, but compensated by HPSS cache)
- cost (favors JBOD, but a PCI RAID adapter may reduce the penalty)
Mock Data Challenge
----------------------------------------------------------------------
                        StorageTek Tape Robot
----------------------------------------------------------------------
      +-------+ +-------+              +-------+ +-------+
      |Redwood| |Redwood|              |Redwood| |Redwood|
      | drive | | drive |              | drive | | drive |
      +-------+ +-------+              +-------+ +-------+
          |         |                      |         |
+----+   +-----------+      +------+      +-----------+   +----+
|Data| - |  IBM F50  |      | SUN  |      |  IBM F50  | - |Data|
|disk|   |   HPSS    |      | SMP  |      |   HPSS    |   |disk|
+----+   +-----------+      +------+      +-----------+   +----+
               |               |                |
         +--------------------------------------------+
         |               Network Switch               |
         +--------------------------------------------+
Core software discussion
------------------------------------
STAF with dynamic loading
o why we need it
o how it works
o other examples
Staf as a framework supposed
Staf as a collaboration wide independent platforms
IRIX, SUN, pc, aix
different computer centers and universities
many years of usage
extremely helpful for debugging
pic (CERNLIB not)
example
o has been working this way for almost 2 years
o common KUIP backbone
- easy to export
o currently works on 9 unix platforms:
irix 5.4, sunos, sun solaris, alpha, hp9, hp10,
aix, pc/solaris, pc/linux
o some executables used by STAR, ATLAS, AMS, CHORUS et al.
Geant4 future
11/25
[Grand Challenge Status and Plans (30 min) - D. Olson]
Nov. 5-7 Workshop
Goals
improve description of architectural components and interfaces; some are
in more need than others
discussed
Define some (one or two) complete scenarios from
RHIC Analysis architecture (13 Nov. 1997)
(incomplete figure)
+-----------+    +---------+    +-----------+
| Parallel  |    |SMQ      |    | Job       |
| Event     |----|Queueing |----| Queueing  |
| Processing|    |Object   |    |           |
+-----+-----+    +---------+    +-----------+
      |
      |
+-----+----------------+
|      Objectivity     |
+-----+----------------+
      |              |
+-----+-----+    +----------+      +-----------+
|Data Cache |----|Data Mover|------| HPSS Tape |
+-----------+    +----------+      +-----------+
Objectivity "Quick Start" Prototype
events (tracks + event)
tag DB (reference to event)
query monitor (convert event OID to file and sort accordingly)
SAE Iterator (get OIDs a file ,, ddv at a time)
analysis code (get back events and tracks, standalone C++)
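The query monitor and iterator steps listed above (map each event OID to
its file, sort accordingly, then hand out OIDs one file at a time) can be
sketched as follows. This is only an illustration of the idea, not the
prototype's code: the function names are hypothetical, and OIDs are
modelled as simple (file id, position) tuples rather than real Objectivity
object identifiers.

```python
from collections import defaultdict


def group_oids_by_file(oids, file_of):
    """Group OIDs by the file that holds them, with files and OIDs sorted.

    `file_of` maps an OID to its file; here it is a caller-supplied
    function, standing in for whatever lookup the real system would use.
    """
    by_file = defaultdict(list)
    for oid in oids:
        by_file[file_of(oid)].append(oid)
    return {name: sorted(by_file[name]) for name in sorted(by_file)}


def iterate_file_at_a_time(oids, file_of):
    """Yield (file, OIDs) batches so each file is visited only once."""
    for name, batch in group_oids_by_file(oids, file_of).items():
        yield name, batch


# Hypothetical selection returned by a tag-database query.
selected = [(2, 5), (1, 9), (2, 1), (1, 3)]
for name, batch in iterate_file_at_a_time(selected, lambda oid: oid[0]):
    print(name, batch)
```

Batching by file in this way means each file needs to be staged from tape
only once per query, which is presumably the motivation for the sort.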
More Objy Prototype
looking at reading venous into Objy
event + vertex + track
Want to go to class with minimal Objectivity database and other for
-event
-tagDB
-query monitor
-SAE iterator
-simple analysis code
Trying to schedule for week of Jan. 26
[[Intel Operating Systems Discussion]]
[Introduction (10 min) - B. Gibbard]
Target commodity Processor (Intel)
Intel Operating systems
UNIX
Solaris - currently deployed , local experience
Linux - widely used in HE/NP community
better support for some important products
Windows NT
Version 4 currently available, though thought by many to lack
required features and robustness
Version 5 - projected for later in calendar '98
RHIC computing migration path options
Intel Solaris -> as far as we can see
Intel Solaris -> Windows NT -> as far as we can see
etc....
[Solaris / Linux Comparison (20 min) - T. Throwe]
Comparison
                   Solaris   Linux
------------------------------------------------------
GNU TOOLS          Yes       Yes
IDE                Yes       Yes
CERNLIB            No        Yes
SMP AFS            Yes       Yes
DFS                No        No
HEP/NP             No        Yes
HPSS               No        No
Objectivity        No        No
Good local price   Yes       Yes
Expertise          Yes       No
CernLib
Solaris/intel version built locally for 97a; should be able to apply to
98a
AFS/DFS
-Solaris
No AFS support for Solaris 2.6
DFS Support unknown
-Linux
AFS at 3.3a
DFS Support unknown, but HELIX is lobbying
Comments (cont.)
HEP/NP Community
Leverage Sun Solaris experience
- used and abandoned by NEARS (Threads)
- under consideration by SLAC
Commercial software
Both Solaris and Linux are missing commercial software
STAF runs on Solaris/intel
[NT "UNIX" Possibilities (20 min) - R. Popescu] migration tools
Project Goals
Transfer the existing UNIX code to NT with minimal effort
Simple administrative tasks, reusing UNIX shells
Porting choice
Rewrite the application
Translators - mapping of UNIX calls to Win NT calls [NuTCRACKER, U/WIN]
OS enhancements - UNIX environments by extending the POSIX subsystem
A comparison of the following three tools for transferring software from
UNIX to NT was reported. (So far this is a document-based comparison;
evaluation on actual NT machines is planned.)
1. NuTCRACKER [DataFocus]
2. OpenNT [Softway systems]
3. U/Win [1.35]
NuTCRACKER [DataFocus]
-Toolkit of porting utilities and libraries used to optimize code for NT
-Translator maps UNIX call to Win32 calls
-May require modification of source code
specifically for NT --> cannot be run under UNIX
-UNIX command functionality through MKS toolkit
-Direct access to UNIX and Win32 APIs, Win32 applications and 3rd party
libraries.
NuTCRACKER [DataFocus]
650 UNIX commands and utilities
SCO's Winduf(?) provides mapping of X and Motif
widgets to Win32
Excellent documentation
Pricey [$6000/dev. seat] [no license manager]
OpenNT [Softway systems]
Unix environment under Windows NT
[extending posix subsystem]
Keeps NT modification to a minimum
maintains portability
Able to reproduce unrestricted UNIX calls....
OpenNT [ Softway Systems]
Three components
-command and utility (unix run time and shell environment)
-X11R6 server/Motif
-SDK - porting tools
Cheaper (x $100)
Win32 accessibility ?
(cannot talk with Objectivity)
The OpenNT architecture
3rd one
U/Win [1.35, Global Technology ltd. inc]
Includes
Libraries to provide UNIX API's
Include files and development tools
ksh and 160 utilities [ ls, ser
....
U/Win [1.35]
NO X graphics
Not complete [ 160 unix utilities]
Free ( for educational and research uses)
Comparison
                    NuTCRACKER   OpenNT   U/WIN
Win32 interaction   Y            ?        Y
Development tools   Better       Good     ?
Unix env.           Good         Good     ?
X graphics          YES          YES      no
Price               $60000       x$100    Free
Tests [UNIX Reviews, Jan 97 p54]
1. Small C application
2. Large multiprocessing, forks, execs
3. Larger client server application !1MB
4. Largest client server application, integrated
Results
     NuTCRACKER      OpenNT
1    ok              ok
2    few hours, ok   10 min., ok
3    20 min. ok      slow
4    failure         failure (gave up after 3 days)
Conclusion
For simple C or script applications, OpenNT offers great price/performance
For heavy X and UNIX calls, NuTCRACKER is a better tool
NuTCRACKER is more complete, more user-friendly and more expensive
Status and Schedule
2 NT servers [a dual and a single PP200] are set up now
C/C++, Fortran compilers are ordered
Open NT evaluation copy in-house
present - end97 --> OpenNT tests
Jan. 98 -> NuTCRACKER tests
OPEN QUESTIONS
Nature of the application
Collaboration-wide development profile
Licensing
References
www.datafocus.com
www.softway.com/OpenNT/home.html
www.reseach.art.com/sw/tools/win
[Linux & NT Farms at DESY (30 min) - M. Baker]
Speaking on behalf of Wolfgang Wander
- head of MIT LNS sci. comp. fac.
- Particle physics by training
- built a successful Linux farm for DESY/HERMES
- Helped DESY/ZEUS convert their failed NT farms to Linux
NT problems
Centralized maintenance difficult
-may be positive with a Unix layer
Poor driver availability for fast ethernet
-Limited to 3.5 Mbps/s
-Still true now (11/97)
Compiler incompatibilities abound.
OS Upgrades were not smooth.
Where to get more information
Contact Tobias Haas
He knows more about the NT side of things
USE the Web
www.desy.de/minute/access32.html
Conclusion
NT4 is worth testing @RCF
-contact experts
-prove that the old problems have been overcome
Don't hold your breath
-Fallback strategy is essential
[NT on the Desk Top (20 min) - T. Wenaus]
Purpose of our desktop NTs
Screens for our many new arrivals
Online development
Off-line development
Motivations
Hardware
SUN Solaris 243Mhz
Gateway 7000 NT server, 2x300 MHz
Software
CORBA
growing NT market
8 December 1997 T.Ichihara (ichihara@rikaxp.riken.go.jp)