Constellation

Basic Information

  • Service Tag: G6MW3P1 (for new hardware)
  • Express Service Code: 35229763765 (for new hardware)
  • IP address: 165.91.232.58
  • One of two load-balanced Moodle web servers
  • Hostname: constellation.cehd.tamu.edu
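
As a cross-check, the Express Service Code listed above appears to be the Service Tag read as a base-36 number (digits 0-9, then A-Z) and written in decimal. A minimal Python sketch, assuming that convention holds; the function name is illustrative only:

```python
def express_service_code(service_tag: str) -> int:
    """Read a service tag as a base-36 number and return its decimal value.

    Assumption: the Express Service Code is simply the base-36 value of
    the tag. This matches the tag/code pair recorded on this page.
    """
    return int(service_tag.strip().upper(), 36)


print(express_service_code("G6MW3P1"))  # 35229763765
```

This can be used to confirm that a tag and code copied into these notes belong to the same machine.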

Student

current name: courses11.cehd.tamu.edu (classes.cehd.tamu.edu)
database: courses11
path: /disks/www/courses11
conf: /etc/httpd/conf.d/ssl.conf

NIC

alias eth0 bnx2
alias eth1 bnx2
alias eth2 igb
alias eth3 igb
alias scsi_hostadapter megaraid_sas
alias scsi_hostadapter1 sata_svw
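
The alias lines above look like entries from a module alias file such as /etc/modprobe.conf (an assumption; the source file is not named here). A small Python sketch of how they could be turned into an interface-to-driver map, with the text pasted in as a string rather than read from the server:

```python
# Alias lines copied from the notes above; the file they came from is assumed.
MODPROBE_ALIASES = """\
alias eth0 bnx2
alias eth1 bnx2
alias eth2 igb
alias eth3 igb
alias scsi_hostadapter megaraid_sas
alias scsi_hostadapter1 sata_svw
"""


def parse_aliases(text: str) -> dict:
    """Return {alias_name: module} for every 'alias <name> <module>' line."""
    aliases = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and anything that is not a 3-field alias.
        if len(fields) == 3 and fields[0] == "alias":
            aliases[fields[1]] = fields[2]
    return aliases


aliases = parse_aliases(MODPROBE_ALIASES)
# Keep only the network interfaces.
nic_drivers = {name: mod for name, mod in aliases.items() if name.startswith("eth")}
print(nic_drivers)
```

The result shows eth0 and eth1 on the Broadcom bnx2 driver and eth2 and eth3 on the Intel igb driver, which agrees with the per-interface hardware listed below.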

eth0

  • Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
  • HWADDR=B8:AC:6F:11:3E:1F

eth1

  • Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
  • HWADDR=B8:AC:6F:11:3E:21

eth2

  • Intel Corporation 82576 Gigabit Network Connection
  • HWADDR=00:1B:21:5B:23:14

Constellation (PE-2970) replacement

constellation, 4RSXGM1, 165.91.232.58

  • First call: case # 817488819, around 6/4/2010, Timothy W Taylor - Dell sent replacement memory.
  • Second call: case # 822654086, 9/12/2010 - System crashed. Dell sent a technician, who replaced the system board and memory.
  • Third call: case # 824404145, 10/16/2010 - System crashed on its own during the early hours (few or no users). Dell sent a technician, who replaced one of the CPUs that has the memory controller.
  • Fourth call: case # 824404145, 10/25/2010 - System crashed under stress testing with the same error messages: E2110 MBE DIMM 7 & 8. Dell supplied a replacement server after this incident.

Constellation Hardware failure on 12/1/2011

Sometime around 3-4 am, the server shut down and was stuck at the "[Press] F1 to continue" stage.
Error message: HyperTransport error caused a system reset. Embedded I/O Bridge Device 2. Please check the system event log for details. F1 to continue.
http://support.dell.com/support/edocs/systems/pe2970/en/hom/about.htm

One user reports that replacing the power supply fixed the problem after all other fixes had been tried.
http://en.community.dell.com/support-forums/servers/f/946/t/19281276.aspx

Another user reports fixing the problem by unplugging a USB keyboard.
http://www.tediosity.com/2011/01/22/dell-2970-crashes-reboots/

The same error also caused a reboot on 6/20/2011. The server is able to stay on.

Dell's resolution

Uploaded the DSET test results to Dell.
http://dtxdropbox.dell.com/ Login Name: 846962168-dset Password: gvweeqkm
Dell case number: 846962168
Dell contact: Brett Francis, DTC91726 - 35027204, Dell Enterprise eSupport. Office hours: 7:30 AM to 4:30 PM CST, Monday-Friday.
ticket title: 4RSXGM1 / PE 2970 / RH 5 / System Reset (KMM133534092I57L0KM)
The most recent error (a PCI parity error) occurred only once. Can we update the Perc 6/i controller firmware and clear the logs to monitor this? To clear the hardware logs, power down, pull the power cords, and hold the power button for 20 seconds. If Dell OpenManage is installed, the logs can also be cleared from that application.
Perc6/i 6.3.1-0003 firmware update: http://www.dell.com/support/drivers/us/en/2684/DriverDetails/DriverFileFormats?DriverId=W83M2&FileId=2731104181
Server was rebooted and the error log was cleared on 12/9/2011.
