25. 05. 2009 OS patching and reconfiguration of some X4500s

All machines have difficulties in getting new patches

t3fs01 (which had already been patched several times last year) shows:

smpatch analyze
Failure: Response code was 403

The permission denied problem disappeared later on without any intervention on my side.
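
Since the 403 went away by itself, a simple retry wrapper can save some manual re-running while the update server is misbehaving. This is only a sketch: the function name retry and the chosen delays are my own, and it assumes that smpatch returns a non-zero exit status when it prints a failure like the one above.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds or MAX attempts are used up.
# Written for bash; the name and arguments are hypothetical helpers.
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Hypothetical usage: try every 10 minutes, at most 6 times
# retry 6 600 smpatch analyze
```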

OS patching

t3fs06 first needed to be registered, which worked ok. Afterwards the smpatch utility showed that I needed to install an update to the patch functionality itself:

smpatch update -i 121119-15

Running smpatch analyze again yielded:

140129-05 SunOS 5.10_x86: bnx patch
122213-32 GNOME 2.6.0_x86: GNOME Desktop Patch
138765-01 SunOS 5.10_x86: acc driver patch
139990-01 SunOS 5.10_x86: wtmpfix patch
119253-31 SunOS 5.10_x86: System Administration Applications Patch
124631-25 SunOS 5.10_x86: System Administration Applications, Network, and Core Libraries Patch
120200-15 SunOS 5.10_x86: sysidtool Patch
124629-10 SunOS 5.10_x86: CD-ROM Install Boot Image Patch
120544-14 SunOS 5.10_x86: Apache 2 Patch
139521-02 SunOS 5.10_x86: package specific [ir].manifest removal patch
122912-15 SunOS 5.10_x86: Apache 1.3 Patch
125556-03 SunOS 5.10_x86: patch behavior patch
140797-01 SunOS 5.10_x86: umountall patch
140900-01 SunOS 5.10_x86: [ir].manifest patch
141017-01 SunOS 5.10_x86: Dummy Patch
139556-08 SunOS 5.10_x86: Kernel Patch
140090-02 SunOS 5.10_x86: mount, quota and libmapid.so.1 patch
119784-10 SunOS 5.10_x86: bind patch
126869-03 SunOS 5.10_x86: SunFreeware bzip2 patch
120273-25 SunOS 5.10_x86: SMA patch
123896-10 SunOS 5.9_x86 5.10_x86: Common Agent Container (cacao) runtime upgrade patch 10
140120-01 SunOS 5.10_x86: bge patch
140091-01 SunOS 5.10_x86: intel-ucode.txt patch
121082-08 SunOS 5.10_x86: Disable Transport Agentry for Sun Update Connection Hosted EOL
140566-02 SunOS 5.10_x86: kaio patch
140564-01 SunOS 5.10_x86: ptsl patch
140409-03 SunOS 5.10_x86: nv_sata/sata driver patch
140158-02 SunOS 5.10_x86: st driver patch
140164-01 SunOS 5.10_x86: /kernel/sys/sparcv9/semsys patch
140119-06 SunOS 5.10_x86: sshd patch
140097-03 SunOS 5.10_x86: libpicldevtree.so.1 patch
138175-01 SunOS 5.10_x86: igb driver patch
140906-01 SunOS 5.10_x86: sha256, sha512 patch
140127-01 SunOS 5.10_x86: passwdutil.so.1 patch
140558-01 SunOS 5.10_x86: libldap.so.5 patch
140108-01 SunOS 5.10_x86: nss_nisplus.so.1 and nss_ldap.so.1 patch
140168-01 SunOS 5.10_x86: libsldap.so.1 patch
140392-02 SunOS 5.10_x86: nscd patch
140141-01 SunOS 5.10_x86: libproc patch
140910-01 SunOS 5.10_x86: libaio.so.1 patch
140787-01 SunOS 5.10_x86: nss_files.so.1 patch
140130-06 SunOS 5.10_x86: pam_krb5.so.1 patch
121119-16 SunOS 5.10_x86: Sun Update Connection System Client 1.0.10
140084-01 SunOS 5.10_x86: sh patch
138867-02 SunOS 5.10_x86: sharetab patch
138767-01 SunOS 5.10_x86: ldap-client manifest patch
139997-02 SunOS 5.10_x86: i.rbac and patch postinstall patch
140965-01 SunOS 5.10_x86: umountall patch
140094-01 SunOS 5.10_x86: acctctl patch
140093-01 SunOS 5.10_x86: touch patch
140915-01 SunOS 5.10_x86: cpio patch
140913-01 SunOS 5.10_x86: ufsrestore patch
140908-01 SunOS 5.10_x86: fsstat patch
140861-01 SunOS 5.10_x86: su patch
140114-01 SunOS 5.10_x86: mkfs and newfs patch
140112-02 SunOS 5.10_x86: format patch
140111-02 SunOS 5.10_x86: libnisdb patch
140172-01 SunOS 5.10_x86: ksh,sh,pfksh,rksh,xargs patch
140075-01 SunOS 5.10_x86: zoneinfo patch
140560-01 SunOS 5.10_x86: cron patch
141053-01 SunOS 5.10_x86: inetd patch
140140-02 SunOS 5.10_x86: cryptmod patch
138650-01 SunOS 5.10_x86: i.renamenew r.renamenew patch
119314-26 SunOS 5.10_x86: WBEM Patch
119281-19 CDE 1.6_x86: Runtime library patch for Solaris 10
124394-09 CDE 1.6_x86: Dtlogin smf patch
119279-27 CDE 1.6_x86: dtlogin patch
140405-01 SunOS 5.10_x86: lockstat patch
126366-14 SunOS 5.10_x86: CDE Desktop changes - Solaris Trusted Extensions
139609-02 SunOS 5.10_x86: Emulex-Sun LightPulse Fibre Channel Adapter driver
140785-01 SunOS 5.10_x86: pmap patch
138648-01 SunOS 5.10_x86: /usr/bin/dircmp patch
119118-49 Evolution 1.4.6_x86 patch
139605-03 SunOS 5.10_x86: Sun Fibre Channel Device Drivers
125540-06 Mozilla 1.7_x86: Mozilla Firefox Web browser
119202-34 SunOS 5.10_x86: OS Localization message patch
125333-05 JDS 3_x86: Macromedia Flash Player Plugin Patch
119402-10 SunOS 5.10_x86: Patch for Western Europe Region locale issues
122423-04 SunOS 5.10_x86: add missing locale files for Mozilla
140400-01 SunOS 5.10_x86: in.ftpd patch
120740-05 GNOME 2.6.0_x86: GNOME PDF Viewer based on Xpdf
119907-13 Gnome 2.6.0_x86: Virtual File System Framework patch
119539-18 GNOME 2.6.0_x86: Window Manager Patch
139100-02 SunOS 5.10_x86: gtar patch
139981-03 SunOS 5.10_x86: md patch
120411-30 SunOS 5.10_x86: Internet/Intranet Input Method Framework patch
136883-02 SunOS 5.10_x86: ImageMagick patch
119255-65 SunOS 5.10_x86: Install and Patch Utilities Patch
125062-05 Message Queue 3.7 UR2 Patch 2_x86 SunOS 5.9 5.10 Core product
119091-32 SunOS 5.10_x86: Sun iSCSI Device Driver and Utilities
140095-03 SunOS 5.10_x86: ixgbe patch
118668-20 JavaSE 5.0_x86: update 19 patch (equivalent to JDK 5.0u19)
118669-19 JavaSE for business 5.0_x86: update 18 patch (equivalent to JDK 5.0u18), 64bit
119214-19 NSS_NSPR_JSS 3.12.3_x86: NSPR 4.7.4 / NSS 3.12.3 / JSS 4.3
119704-12 SunOS 5.10_x86: Patch for localeadm issues
119964-14 SunOS 5.10_x86: Shared library patch for C++_x86
120754-06 SunOS 5.10_x86: Microtasking libraries (libmtsk) patch
140791-01 SunOS 5.10_x86: ldap patch
140105-02 SunOS 5.10_x86: usr/bin/printf patch
121431-37 SunOS 5.8_x86 5.9_x86 5.10_x86: Live Upgrade Patch
121429-12 SunOS 5.10_x86: Live Upgrade Zones Support Patch
125732-04 SunOS 5.10_x86: XML and XSLT libraries patch
119247-35 SunOS 5.10_x86: Manual Page updates for Solaris 10
121309-16 SunOS 5.10_x86: Solaris Management Console Patch
125953-18 Sun Java Web Console 3.1[_x86]
140123-01 SunOS 5.10_x86: metastat and mdmonitord patch
140106-01 SunOS 5.10_x86: usr/sbin/rpc.metad patch
138176-02 SunOS 5.10_x86: mega_sas patch
119316-15 SunOS 5.10_x86: Solaris Management Applications Patch
140104-01 SunOS 5.10_x86: usr/lib/snmp/mibiisa patch
121621-04 SunOS 5.10_x86: Patch for mediaLib in Solaris
139603-01 SunOS 5.10_x86: multipathing patch
140388-01 SunOS 5.10_x86: statd patch
140789-01 SunOS 5.10_x86: nfsd, nfs4cbd, lockd patch
140109-02 SunOS 5.10_x86: nfssrv patch
140911-01 SunOS 5.10_x86: nge driver patch
140166-01 SunOS 5.10_x86: ldap_cachemgr patch
140793-01 SunOS 5.10_x86: usr/lib/nis/nisupdkeys patch
140124-01 SunOS 5.10_x86: kernel/drv/dnet patch
139290-01 SunOS 5.10_x86: pgadmin3 patch
123591-10 SunOS 5.10_x86: PostgresSQL patch
140836-01 SunOS 5.10_x86: package specific [ir].rbac removal patch
138827-03 SunOS 5.10_x86: PostgreSQL 8.3 core patch
138823-03 SunOS 5.10_x86: PostgreSQL 8.3 documentation patch
140840-01 SunOS 5.10_x86: pg_upgrade.sh patch
140107-01 SunOS 5.10_x86: sppp driver patch
119789-09 Synopsis: SunOS 5.10_x86: Sun Update Connection Proxy 1.0.9
140086-01 SunOS 5.10_x86: ata patch
140121-01 SunOS 5.10_x86: usr/lib/lp/bin/netpr patch
139607-01 SunOS 5.10_x86: Qlogic ISP Fibre Channel Device Driver
138646-01 SunOS 5.10_x86: dscpmk patch
140160-01 SunOS 5.10_x86: rsh/rlogin/rcp/rdist patch
140131-01 SunOS 5.10_x86: rds patch
122676-02 SunOS 5.10_x86: SunFreeware samba man pages patch
140862-01 SunOS 5.10_x86: libdiskmgt patch
119758-14 SunOS 5.10_x86: Samba patch
119961-05 SunOS 5.10_x86, x64, Patch for profiling libraries and assembler
120186-17 StarOffice 8 (Solaris_x86): Update 12
136709-01 SunOS 5.10_x86: Service Tags patch
139775-01 X11 6.6.2_x86: xfs stfsloader patch
140170-01 SunOS 5.10_x86: tftp patch
125542-04 Mozilla 1.7_x86: Mozilla Thunderbird email client
140083-01 SunOS 5.10_x86: tnrhdb patch
140092-01 SunOS 5.10_x86: ehci, uhci, scsa2usb and hidparser patch
140089-01 SunOS 5.10_x86: vold patch
139657-02 SunVTS 7.0: Patch Set 5 for Solaris 10_x86
140133-01 SunOS 5.10_x86: xge patch
125720-28 X11 6.8.0_x86: Xorg server patch
119060-45 X11 6.6.2_x86: Xsun patch
126364-07 SunOS 5.10_x86: X Window System changes - Solaris Trusted Extensions
120095-22 X11 6.6.2_x86: xscreensaver patch
140102-01 SunOS 5.10_x86: rpc.ypupdated patch
141105-01 SunOS 5.10_x86: ZFS Administration Java Web Console Patch
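
A list of this length is easier to handle when reduced to the patch IDs, or filtered by a keyword in the synopsis. The two helpers below are a sketch with names of my own choosing (patch_ids, patches_matching); they only assume the "ID synopsis" line format shown above and read the saved output from stdin.

```shell
# patch_ids: print only the patch IDs from `smpatch analyze` output on stdin.
# Lines that do not start with an NNNNNN-NN style ID (e.g. error messages) are skipped.
patch_ids() {
    awk '$1 ~ /^[0-9]+-[0-9]+$/ { print $1 }'
}

# patches_matching PATTERN: print IDs of patches whose line matches PATTERN
# (case-insensitive), e.g. "kernel" or "driver".
patches_matching() {
    awk -v pat="$1" '$1 ~ /^[0-9]+-[0-9]+$/ && tolower($0) ~ tolower(pat) { print $1 }'
}

# Hypothetical usage:
# smpatch analyze > /tmp/analyze.out
# patch_ids < /tmp/analyze.out | wc -l
# patches_matching driver < /tmp/analyze.out
```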

Fixing disk label errors and setting up RAID mirroring on the boot disk

During startup, the following warnings appear:

May 25 11:28:55 t3fs06 scsi: WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@1,0 (sd13):
May 25 11:28:55 t3fs06  primary label corrupt; using backup
May 25 11:28:55 t3fs06 scsi: WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@2,0 (sd18):
May 25 11:28:55 t3fs06  primary label corrupt; using backup
May 25 11:28:55 t3fs06 scsi: WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@4,0 (sd30):
May 25 11:28:55 t3fs06  primary label corrupt; using backup
May 25 11:29:02 t3fs06 scsi: WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@4,0 (sd30):
May 25 11:29:02 t3fs06  primary label corrupt; using backup
May 25 11:29:06 t3fs06 scsi: WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@1,0 (sd13):
May 25 11:29:06 t3fs06  primary label corrupt; using backup
May 25 11:29:10 t3fs06 scsi: WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@2,0 (sd18):
May 25 11:29:10 t3fs06  primary label corrupt; using backup

Entries for sd18 from dmesg:

May 25 11:28:55 t3fs06 scsi: [ID 193665 kern.info] sd18 at marvell88sx0: target 2 lun 0
May 25 11:28:55 t3fs06 genunix: [ID 936769 kern.info] sd18 is /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@2,0
May 25 11:28:55 t3fs06 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@2,0 (sd18):

Using hd -w we can map the PCI path to the normal disk names:

c0t1 = /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@1,0 (sd13)
c0t2 = /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@2,0 (sd18)
c0t4 = /pci@0,0/pci1022,7458@1/pci11ab,11ab@1/disk@4,0  (sd30)
All disks sit on the same controller /devices/pci@0,0/pci1022,7458@1/pci11ab,11ab@1
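
The warnings repeat several times per disk, so it helps to reduce the log to one line per affected device before doing the mapping. The following is a sketch (the function name corrupt_label_disks is my own) that assumes the two-line message format shown above, i.e. a WARNING line ending in "(sdNN):" followed by the "primary label corrupt" line.

```shell
# corrupt_label_disks: read syslog text on stdin and print "sdNN <device path>"
# once for every disk that reported a corrupt primary label.
corrupt_label_disks() {
    awk '
        /WARNING:/ {
            dev = ""
            for (i = 1; i < NF; i++)
                if ($i == "WARNING:") {
                    name = $NF
                    gsub(/[():]/, "", name)      # "(sd13):" -> "sd13"
                    dev = name " " $(i + 1)      # device path follows WARNING:
                }
            next
        }
        /primary label corrupt/ && dev != "" {
            if (!(dev in seen)) { seen[dev] = 1; print dev }
            dev = ""
        }
    '
}

# Hypothetical usage:
# corrupt_label_disks < /var/adm/messages
```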

Looking at the partitioning information with hd -a, almost all disks except the boot disks were labelled "none" in the fdisk type column. c0t0d0p0 was labelled as LinuxNative Solaris (probably somebody had done some test installs on this machine). I set up ZFS on all disks (using our t3 ZFS script), and afterwards all disks were labelled "EFI".

There was no RAID mirroring of the boot disk!

prtvtoc /dev/dsk/c6t0d0s2

* /dev/dsk/c6t0d0s2 partition map
* Dimensions:
*     512 bytes/sector
*      63 sectors/track
*     255 tracks/cylinder
*   16065 sectors/cylinder
*   60800 cylinders
*   60798 accessible cylinders
* Flags:
*   1: unmountable
*  10: read-only
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00      16065  16225650  16241714   /
       1      3    01   16241715  16065000  32306714       # this is swap
       2      5    00          0 976719870 976719869
       3      0    00   32306715     32130  32338844
       4      0    00   32338845     32130  32370974
       5      4    00   32370975  16065000  48435974   /usr
       6      7    00   48435975  16257780  64693754   /var
       7      0    00   64693755 912026115 976719869   /opt
       8      1    01          0     16065     16064
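
The sector counts are hard to read at a glance. A small sketch to convert them into MiB, assuming 512 bytes per sector as stated in the header above (so sectors / 2048 = MiB); the function name vtoc_sizes is my own:

```shell
# vtoc_sizes: read prtvtoc output on stdin, print each slice with its size in MiB.
# Header lines start with "*" and are skipped; column 5 is the sector count.
vtoc_sizes() {
    awk '$1 !~ /^\*/ && NF >= 6 { printf "s%s %d MiB\n", $1, $5 / 2048 }'
}

# Hypothetical usage:
# prtvtoc /dev/dsk/c6t0d0s2 | vtoc_sizes
```

For slice 0 (16225650 sectors) this gives 7922 MiB, i.e. a root slice of a bit under 8 GB.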

prtvtoc /dev/dsk/c6t0d0s2 | fmthard -s - /dev/rdsk/c6t4d0s2
fmthard: Partition 0 overlaps partition 2. Overlap is allowed
        only on partition on the full disk partition).

The problem is that the two disks are not of the same fdisk type:

hd -a

Device    Serial        Vendor   Model             Rev  Temperature Type
------    ------        ------   -----             ---- ----------- ----

c6t0d0p0  F402P6G3N3EF  ATA      HITACHI HUA7250S  A90A None      Solaris2
c6t4d0p0  F402P6G3N3TF  ATA      HITACHI HUA7250S  A90A None      EFI

Very good article on using format: http://www.sun.com/bigadmin/content/submitted/format_utility.jsp

We need to use format with the -e (expert) option to be able to choose the label type (SMI or EFI):

format -e
Searching for disks...done

      28. c6t4d0 

Specify disk (enter its number): 28
selecting c6t4d0
[disk formatted]

format> current
Current Disk = c6t4d0


format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all
current partitions.
Continue? y
Auto configuration via format.dat[no]?
Auto configuration via generic SCSI-2[no]?

The current rpm value 0 is invalid, adjusting it to 3600
You must use fdisk to delete the current EFI partition and create a new
Solaris partition before you can convert the label.

I then used fdisk to delete partition 1 and recreate a "Solaris2" partition.

Should this become the active partition? y

After this, I was able to correctly copy the disk partition scheme from the boot disk:

prtvtoc /dev/dsk/c6t0d0s2 | fmthard -s - /dev/rdsk/c6t4d0s2
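
Before building mirrors on top of the copy, it seems worth checking that both disks really carry the same slice table. A sketch, with the helper name vtoc_slices and the temporary file names being my own choices: it strips the header and the mount directory column so that the two prtvtoc outputs can be compared with diff.

```shell
# vtoc_slices: read prtvtoc output on stdin and print only
# partition, tag, flags, first sector, sector count and last sector.
vtoc_slices() {
    awk '$1 !~ /^\*/ && NF >= 6 { print $1, $2, $3, $4, $5, $6 }'
}

# Hypothetical usage:
# prtvtoc /dev/rdsk/c6t0d0s2 | vtoc_slices > /tmp/src.vtoc
# prtvtoc /dev/rdsk/c6t4d0s2 | vtoc_slices > /tmp/dst.vtoc
# diff /tmp/src.vtoc /tmp/dst.vtoc && echo "slice tables match"
```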

Initialize the Solaris Volume Manager state database replicas (metadb) on the unassigned slices 3 and 4 of both disks:

bash-3.00# metadb -af -c 2 /dev/dsk/c6t0d0s3 /dev/dsk/c6t0d0s4
bash-3.00# metadb -af -c 2 /dev/dsk/c6t4d0s3 /dev/dsk/c6t4d0s4

metainit -f d10 1 1 /dev/dsk/c6t0d0s0
d10: Concat/Stripe is setup

metainit -f d11 1 1 /dev/dsk/c6t4d0s0
d11: Concat/Stripe is setup

metainit d1 -m d10
d1: Mirror is setup

metaroot d1

# mirror for swap

swap -d /dev/dsk/c6t0d0s1
/dev/dsk/c6t0d0s1 was dump device --
invoking dumpadm(1M) -d swap to select new dump device
dumpadm: no swap devices are available

metainit d20 1 1 c6t0d0s1
 d20: Concat/Stripe is setup

metainit d21 1 1 c6t4d0s1
 d21: Concat/Stripe is setup

metainit d2 -m d20
 d2: Mirror is setup

metattach d2 d21
 d2: submirror d21 is attached

# Edit the file /etc/vfstab and change the line 
/dev/dsk/c6t0d0s1  -       -       swap    -       no      -
# to 
/dev/md/dsk/d2  -       -       swap    -       no      -
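
The same edit can be done non-interactively. This is a sketch only (the function name swap_to_md is my own); it writes to stdout so the result can be reviewed before it replaces /etc/vfstab, and it touches nothing except swap entries with the given device.

```shell
# swap_to_md OLD NEW: read vfstab text on stdin and replace device OLD by NEW
# on swap entries; all other lines are passed through unchanged.
swap_to_md() {
    awk -v old="$1" -v new="$2" '
        BEGIN { OFS = "\t" }
        $1 == old && $4 == "swap" { $1 = new }   # rebuilds the line tab-separated
        { print }
    '
}

# Hypothetical usage:
# cp /etc/vfstab /etc/vfstab.orig
# swap_to_md /dev/dsk/c6t0d0s1 /dev/md/dsk/d2 < /etc/vfstab.orig > /tmp/vfstab.new
# diff /etc/vfstab.orig /tmp/vfstab.new
```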

# reboot
init 6

# attach the mirror of the root partition
metattach d1 d11

# Copy the boot loader files
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c6t4d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 265 sectors starting at 50 (abs 16115)

# check in the output of metastat whether the mirror partitions are resynching
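
# A sketch for pulling the resync progress out of metastat. The helper name
# resync_status is my own, and the line formats it looks for
# ("dN: Mirror", "Resync in progress: NN % done") are assumptions about
# the metastat output on this Solaris 10 release.

```shell
# resync_status: read metastat output on stdin and print "<mirror> <percent>%"
# for every mirror that is currently resyncing; prints nothing when idle.
resync_status() {
    awk '
        /^d[0-9]+: Mirror/   { sub(/:$/, "", $1); mirror = $1 }
        /Resync in progress/ { print mirror, $4 "%" }
    '
}

# Hypothetical usage:
# metastat | resync_status
```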

# set up mirror for /usr
metainit -f d30 1 1 /dev/dsk/c6t0d0s5
 d30: Concat/Stripe is setup

metainit -f d31 1 1 /dev/dsk/c6t4d0s5
 d31: Concat/Stripe is setup

metainit d3 -m d30
 d3: Mirror is setup

metattach d3 d31
 d3: submirror d31 is attached

# change /etc/vfstab entry for the /usr fs to
/dev/md/dsk/d3  /dev/md/rdsk/d3 /usr    ufs     1       no      -

# Same procedure for /var
metainit -f d40 1 1 /dev/dsk/c6t0d0s6
d40: Concat/Stripe is setup
bash-3.00# metainit -f d41 1 1 /dev/dsk/c6t4d0s6
d41: Concat/Stripe is setup
bash-3.00# metainit d4 -m d40
d4: Mirror is setup
bash-3.00# metattach d4 d41
d4: submirror d41 is attached

# Same procedure for /opt
metainit -f d50 1 1 /dev/dsk/c6t0d0s7
d50: Concat/Stripe is setup
bash-3.00# metainit -f d51 1 1 /dev/dsk/c6t4d0s7
d51: Concat/Stripe is setup
bash-3.00# metainit d5 -m d50
d5: Mirror is setup
bash-3.00# metattach d5 d51
d5: submirror d51 is attached
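
The four commands per file system follow a fixed naming pattern (dN0 and dN1 as submirrors of dN). For the next machine, the sequence could be generated rather than typed. A sketch that only prints the commands for review and does not run them; the function name mirror_cmds and the default disk names are my own assumptions taken from this machine. It fits mounted non-root file systems like /usr, /var and /opt; root additionally needs metaroot and a reboot before the metattach, as shown further up.

```shell
# mirror_cmds N SLICE [SRC] [DST]: print the commands that mirror slice SLICE
# of disk SRC onto disk DST as metadevice dN (submirrors dN0 and dN1).
mirror_cmds() {
    n=$1; slice=$2
    src=${3:-c6t0d0}; dst=${4:-c6t4d0}
    echo "metainit -f d${n}0 1 1 /dev/dsk/${src}s${slice}"
    echo "metainit -f d${n}1 1 1 /dev/dsk/${dst}s${slice}"
    echo "metainit d${n} -m d${n}0"
    echo "metattach d${n} d${n}1"
}

# Hypothetical usage:
# mirror_cmds 3 5    # /usr
# mirror_cmds 4 6    # /var
# mirror_cmds 5 7    # /opt
```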


Two disks on t3fs05 (that had been taken out for some time) have lost their labels.

May 26 11:33:50 t3fs05 scsi: [ID 107833 kern.warning] WARNING: /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@1,0 (sd8):
May 26 11:33:50 t3fs05  primary label corrupt; using backup
May 26 11:33:52 t3fs05 scsi: [ID 107833 kern.warning] WARNING: /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@5,0 (sd32):
May 26 11:33:52 t3fs05  primary label corrupt; using backup

# mapping to devices using hd -w
c5t5 = /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@5,0
c5t1 = /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@1,0

I used format, selected the disks, and then used the backup option as suggested by the tool. Thereafter hd -a showed the disks correctly again with the EFI label.

Connection and DNS resolution problems

Connecting to t3fs05 via ssh was extremely slow, and t3fs05 was unable to resolve hostnames via the name server. The configuration on the host looked fine. The problem was that the switch ports were already configured for bonded (aggregated) interfaces, while the machine was using only the single e1000g0 interface.

After aggregating the interfaces, everything was fine.
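
The exact commands were not recorded here. As a sketch of what the aggregation looks like with the Solaris 10 dladm syntax: the helper below only prints the commands. The function name aggr_cmds, the key 1 and the second interface name are assumptions, and the IP configuration still has to be moved from e1000g0 to the aggregation afterwards. If the switch expects LACP, create-aggr would also need an -l option.

```shell
# aggr_cmds KEY DEV...: print the commands that create link aggregation KEY
# from the given interfaces and plumb the resulting aggrKEY interface.
aggr_cmds() {
    key=$1; shift
    cmd="dladm create-aggr"
    for dev in "$@"; do
        cmd="$cmd -d $dev"
    done
    echo "$cmd $key"
    echo "ifconfig aggr$key plumb"
}

# Hypothetical usage (interface names are assumptions):
# aggr_cmds 1 e1000g0 e1000g1
```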

t3fs01 unkillable java process problem and patching

Jobs from a user had left 56 connections in CLOSE_WAIT on the t3fs01_cms pool. I tried to issue mover kill on the matching movers in the pool cell. After the first such command, the pool got stuck. Shutting down dCache on this file server failed to stop the t3fs01 domain process. The process could not be killed even with SIGKILL and continued to hold its resources.

The only way to get rid of the problem was to reboot the system.
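
To spot this situation earlier, the number of connections stuck in CLOSE_WAIT can be counted from the netstat output. A sketch with a helper name of my own (count_state), assuming that the connection state is the last column of each netstat -an line:

```shell
# count_state STATE: read `netstat -an` output on stdin and print the number
# of connections whose state (last column) equals STATE.
count_state() {
    awk -v s="$1" '$NF == s { n++ } END { print n + 0 }'
}

# Hypothetical usage:
# netstat -an | count_state CLOSE_WAIT
```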

The reboot triggered the installation of a number of patches that had apparently been "lying in wait". Luckily this did not cause any boot problems.

Installing updates
Installing update 128338-02 Succeeded
Installing update 126207-04 Succeeded
Installing update 127889-10 Succeeded
Installing update 127128-11 Failed
Installing update 137293-02 Failed
Installing update 121429-10 Failed         # ( Derek: obsoleted patch)
Installing update 137112-05 Succeeded
Installing update 138071-03 Succeeded
Installing update 138053-02 Succeeded
Installing update 138307-01 Succeeded
Installing update 120273-23 Succeeded
Installing update 138065-03 Succeeded
Installing update 138061-03 Succeeded
Installing update 137022-02 Succeeded
Installing update 138045-02 Succeeded
Installing update 138043-02 Succeeded
Installing update 121005-04 Succeeded
Installing update 137290-01 Succeeded
Installing update 128401-05 Succeeded
Installing update 138309-02 Succeeded
Installing update 128307-05 Succeeded
Installing update 128301-04 Succeeded
Installing update 138091-01 Succeeded
Installing update 138076-02 Succeeded
Installing update 138084-01 Succeeded
Installing update 137020-02 Succeeded
Installing update 123896-04 Succeeded
updating /platform/i86pc/boot_archive...this may take a minute
cannot unmount '/data1': Device busy
svc.startd: The system is down.
syncing file systems... done
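
To see quickly which updates need a second look, the console log can be summarized. A sketch (the function name update_summary is my own) that assumes the "Installing update ID Succeeded/Failed" line format shown above:

```shell
# update_summary: read the boot-time update log on stdin, print the number of
# successful updates and the IDs of the failed ones.
update_summary() {
    awk '
        $1 == "Installing" && $2 == "update" {
            if ($4 == "Failed") failed = failed " " $3
            else if ($4 == "Succeeded") ok++
        }
        END { printf "succeeded: %d\nfailed:%s\n", ok, failed }
    '
}

# Hypothetical usage, with the console output saved to a file:
# update_summary < /tmp/boot-updates.log
```

For the log above this reports 24 succeeded updates and the three failed ones (127128-11, 137293-02, 121429-10).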

