NEC 1000 Series Brochure & Specs page 6

Express5800/1000 series


Mainframe-class RAS Features

RAS Design Philosophy

Realization of mainframe-class continuous operation through the pursuit of reliability and availability in a single server construct

Reliability and availability on an open server are generally achieved by implementing clustering. However, clustering comes with a price tag. To keep costs to a minimum, the Express5800/1000 series servers were designed to achieve a high level of reliability and availability within a single server.

The Express5800/1000 series server's powerful RAS features were developed through the pursuit of dependable server technology.

Dependable Server Technology

Continuous operations through failures

Redundant components, error prediction and error correction allow for continuous operation

Minimized spread of failures

Technology to minimize the effects of hardware failures on the system. Reduction of performance degradation and multi-node shutdown

Smooth recovery after failures

Ability to replace failed components without shutting down operations

The Dual-Core Intel Itanium processor MCA (Machine Check Architecture)

The framework for hardware, firmware and OS error handling

The Dual-Core Intel Itanium processor, designed for high-end enterprise servers, not only excels in performance but is also rich in RAS features. At the core of the processor's RAS feature set is the error handling framework, called MCA.

MCA provides a three-stage error handling mechanism: hardware, firmware, and operating system. In the first stage, the CPU and chipset attempt to handle errors through ECC (Error Correcting Code) and parity protection. If the error cannot be handled by the hardware, it is passed to the second stage, where the firmware attempts to resolve the issue. In the third stage, if the error cannot be handled by the first two stages, the operating system runs recovery procedures based on the error report and error log that it received. In the event of a critical error, the system will automatically reset, to significantly reduce the possibility of a system failure.
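The escalation order described above can be sketched in a few lines of Python. This is only an illustrative model, not NEC's or Intel's implementation: the `Severity` scale, the `ErrorEvent` type, and the `handle_error` function are hypothetical names introduced here, and each stage is assumed to resolve every error at or below a fixed severity.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; not taken from the MCA specification."""
    CORRECTABLE = 1   # e.g. an error that ECC or parity logic can absorb
    PLATFORM = 2      # assumed to need firmware intervention
    COMPLEX = 3       # assumed to need OS recovery procedures
    CRITICAL = 4      # assumed unrecoverable by any stage


@dataclass
class ErrorEvent:
    source: str
    severity: Severity


# Each stage is modelled as the highest severity it is assumed to resolve.
STAGES = [
    ("hardware", Severity.CORRECTABLE),
    ("firmware", Severity.PLATFORM),
    ("operating system", Severity.COMPLEX),
]


def handle_error(event):
    """Return (resolving stage or 'system reset', escalation log)."""
    log = []
    for name, limit in STAGES:
        if event.severity <= limit:
            log.append(f"{name}: resolved error from {event.source}")
            return name, log
        log.append(f"{name}: could not resolve, escalating")
    # No stage could handle the error: model the automatic reset.
    log.append("critical error: resetting system")
    return "system reset", log


if __name__ == "__main__":
    for sev in Severity:
        stage, _ = handle_error(ErrorEvent("memory", sev))
        print(sev.name, "->", stage)
```

The point of the sketch is the ordering: an error only reaches the firmware or the operating system after the stage before it has failed to resolve it, and the reset is the last resort.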


Continuous operation through failures, minimized spread of failures, and smooth recovery after failures were the goals that led to the implementation of technologies such as memory mirroring, increased redundancy of intricate components, and modularization. Through these technologies, a mainframe level of continuous operation was achieved.

[Chart: RAS features by component]

Level labels: Mainframe Level, Conventional open server Level, PC Server Level

Components: center plane, chipset, clock, core I/O, PCI card, memory, CPU, L3 cache, power, HDD

Reliability
No chipset on the center plane
ECC protection of main data paths
Intricate error detection of the high-speed interconnects
ECC protection
SDDC Memory
Intel Cache Safe Technology*

*1 Available only on the 1320Xf/1160Xf

*2 Available only on the 1320Xf

*3 Intel technology designed to avoid cache-based failures

*4 Replacement of failed component without shutting down other partitions.

Application Layer

Operating System: The OS logs the error, and then starts the recovery process. The firmware and OS aid in the correction of complex platform errors to restore the system.

Firmware: Seamlessly handles the error. Error details are logged, and then a report flow is defined for the OS.

Hardware: CPU and chipset ECC and parity protection. Detects and corrects a wide range of hardware errors for main data structures.
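To make the idea of ECC protection at the hardware layer concrete, here is a minimal sketch of single-bit error correction. The brochure does not say which codes the CPU and chipset use, so the textbook Hamming(7,4) code is used here purely as a stand-in; production server ECC works on much wider words. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7


def hamming74_correct(codeword):
    """Return (data bits after correction, 1-based error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1                      # simulate a single-bit fault
    print(hamming74_correct(word))
```

A single flipped bit anywhere in the codeword is located by the syndrome and corrected without involving firmware or the OS, which is the role the hardware stage plays in the layering above. Two simultaneous bit flips are beyond what this small code can correct.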

Availability
Partial chipset degradation/Dynamic recovery
Duplexed*
16 processor domain segmentation*
Core I/O Relief
Memory Mirroring*
N+1 Redundant
Two independent power sources
Software RAID
Hardware RAID

Serviceability
Hot Pluggable (listed six times in the chart)

This manual is also suitable for:

1080rf 1320xf/1160xf 5800 series Express5800/1080rf Express5800/1160xf Express5800/1320xf
