NEW PCI TECHNOLOGY OPTIMIZES SERVER AND WORKSTATION PERFORMANCE
-----------------------------------------------------------------
Technical White Paper Prepared by: Accton Technology Corp
-----------------------------------------------------------------

Introduction

Since Intel introduced the Peripheral Component Interconnect (PCI)
specification for local-bus microcomputer architectures in 1993, PCI
technology has been taking the personal computer market by storm. Designed
to take advantage of faster PC processors, such as the 80486 and Pentium
chips, PCI promises to deliver 16 times the data throughput of
conventional PC architectures.

Naturally, Ethernet adapter manufacturers have been scrambling to optimize
their own technology to make the best possible use of the PCI local bus
architecture. One of the greatest bottlenecks to network performance has
always been the limitation in getting data on and off the network from a
network server or workstation. Tapping the power of PCI for network
applications could obviate many of these LAN performance problems.

The Evolution of Local Bus Architectures

PCI is a local bus architecture for the personal computer that delivers
higher throughput from input/output (I/O) devices by speeding up the bus
clock and widening the bus data path beyond the 8- and 16-bit widths set by
the IBM AT architecture. PCI also takes advantage of the full spectrum of
advanced bus features, including interrupt sharing, burst-mode transfers,
and arbitrated bus mastering.

Actually, local bus technology itself is far from new. The old IBM PC and
XT are classic examples of local bus architectures. The whole concept of
local bus design is to directly connect the peripheral bus to the central
processing unit (CPU). So plugging a board into a PC's expansion bus would
be the same as directly connecting that board to the CPU. (Naturally, the
same is true for network adapter boards.) In order to make the direct
connection possible, the clock speed and width of the bus must match that
of the CPU. In the case of the XT, the bus width is 8 bits and the clock
speed is 4.77 MHz.

The local bus model can be extended to accommodate faster microprocessors,
as is the case with PCI. In a true local-bus design, the rate at which
components connected to the bus exchange data with the bus circuitry has
to match the speed of the microprocessor. The faster the microprocessor,
the faster the bus, but at no time can the speed of the bus exceed that of
the CPU.

First Came ISA, Then Came EISA

With the introduction of the PC AT, IBM changed the local-bus model by
making it possible for the microprocessor to run faster than the bus. So
as the 286 gave way to the 386, PC and PC clone manufacturers began
implementing different clock speeds, e.g. 286 chips running at 10 or 12
MHz and 386 chips running at 16 MHz or more. By limiting the speed of the
expansion bus, vendors could continue to support existing add-in hardware
that had not been optimized for the faster clock rates. This design
approach was finally formalized as the Industry Standard Architecture (ISA)
in 1988.

This ISA architecture not only imposed limitations on the host computer's
clock speed, it also meant that the microprocessor (or motherboard) had to
take on the added task of managing transfers to and from the bus. As a
result, valuable CPU cycles were being expended to coordinate RAM reads
and writes, communication with the video adapter and the printer ports,
and other processing tasks. Even with the advent of the bus mastering
design, the separation of clock and bus speeds imposed performance
limitations on adapters.

To obviate this limitation, IBM created the Micro Channel architecture
(MCA), which delivered a 32-bit data path with a 10-MHz clock speed. The
result was the potential for peak transfer rates up to 20 MB/s, versus 8
MB/s for the ISA architecture. What MCA added to the design was a separate
control circuit devoted exclusively to managing the bus, so now the bus
speed and the microprocessor clock speed became truly separated. However,
the MCA architecture is totally incompatible with ISA technology, and its
higher cost has been a further obstacle for IBM as it tries to move PS/2
computers into the marketplace.

EISA (Extended ISA) applied the same concept, isolating the control of the
bus from the microprocessor and widening the data path to 32 bits. Unlike
MCA, however, EISA is backward-compatible and uses a bus system that is
compatible with both ISA and EISA. Unfortunately, the price of
backward-compatibility is less than optimal performance. The EISA card can
support data transfer rates of up to 33 MB/s, but having to accommodate
ISA's slower 8 MB/s hobbles EISA technology in ordinary operations. The
significant added cost for EISA compared to ISA and the minimal
performance improvement of EISA over ISA have also been factors in the
slow acceptance of Extended ISA.

The local-bus design re-emerged in the early 1990s to support the new
generation of video systems. Early systems were implementing improved
displays on the motherboard and using a local-bus architecture to push
more color and video data through the data pipeline. Connecting the video
display buffer directly to the microprocessor eliminated the problem of
having to slow the video response to match the speed of the workstation's
bus. Instead, the microprocessor was able to manipulate the host computer's
images using the full speed and bus width of the CPU.

Even though the need for local-bus video has all but disappeared with the
introduction of fixed-function graphic accelerators to control video frame
buffers, the interest in applications for local bus architectures remains
high. For example, hard drive systems have become faster and are now
straining the throughput rates of conventional ISA bus architectures.

VESA Creates VL-Bus

This resurgence in interest in local bus technology prompted the Video
Electronics Standards Association to develop their own unified local-bus
expansion design, called the VL-Bus. The advantage of the VL-Bus is that
it can handle both 32-bit and 16-bit operations. It was originally
targeted to support video, but the VL-Bus design is broad enough that it
can also support other peripherals that need high-bandwidth data
transfers. And with pin-out connections for local-bus circuits, it's
possible for third parties to develop interchangeable adapters and
expansion boards. The VL-Bus also includes only one hardware interrupt
control (IRQ9), which is used to hook into either an ISA, EISA, or MCA
bus.

The design of the VL-Bus is basically that of a set of buffered address,
data, and control signals that are directly connected to the host
processor. The specification sets no upper limit on clock speed, and
manufacturers can add buffers but only at the sacrifice of speed. The
VL-Bus tops out at a speed of 66 MHz. That's why the recommendation is to
use no more than three local bus devices running at speeds up to 33 MHz,
two devices at 40 MHz, or one device at 50 MHz. Since there are really
only three adapters that can take advantage of a local-bus connection - a
video card, a disk controller, and a network adapter - this design is
adequate.
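
The loading recommendation above can be expressed as a simple lookup. What
follows is only a minimal Python sketch: the function name
max_vl_bus_devices is hypothetical, and returning zero for clocks above
50 MHz is an assumption, since the recommendation quoted here says nothing
about faster settings.

```python
def max_vl_bus_devices(bus_clock_mhz: float) -> int:
    """Return the recommended maximum number of VL-Bus devices.

    Thresholds come from the recommendation quoted in the text:
    three devices up to 33 MHz, two at 40 MHz, one at 50 MHz.
    """
    if bus_clock_mhz <= 0:
        raise ValueError("bus clock must be positive")
    if bus_clock_mhz <= 33:
        return 3
    if bus_clock_mhz <= 40:
        return 2
    if bus_clock_mhz <= 50:
        return 1
    # Assumption: no guidance is given above 50 MHz, so report no slots.
    return 0


if __name__ == "__main__":
    for clock in (25, 33, 40, 50):
        print(clock, "MHz ->", max_vl_bus_devices(clock), "device(s)")
```

The point of the sketch is that the faster the local bus runs, the fewer
loads it can drive, which is why the recommendation shrinks as the clock
rises.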

Unfortunately, software configuration is not possible with VL-Bus adapters,
so it is up to the manufacturer to set up jumpers and DIP switches, or to
develop their own software to alter on-board EEPROM configurations.
Drivers are not required, except for configuration purposes. And with the
advent of the 64-bit Pentium processor, VESA began working on a new
standard that could tap the Pentium's processing power and still remain
backward compatible with the old 32-bit architecture.

Enter PCI

Unlike other local-bus designs, PCI can accommodate the most advanced
aspects of computer hardware design. It was developed to let designers
create systems capable of supporting the latest in multimedia technology,
encompassing a wide range of peripherals that support data-intensive
applications. For example, PCI supports arbitrated bus mastering which
handles interrupts on a prioritized basis. It also has its own bus command
language and directly supports secondary caches.

PCI was designed for the engineer. The PCI design is independent of the
microprocessor chip, so manufacturers don't have to retrofit their
existing adapter design to accommodate a new microprocessor. It promises a
new standard that will allow adapter makers to create a single set of
cards that will support multiple computing platforms. In fact, the PCI
architecture will not only support different Intel platforms, such as the
486 and Pentium, but it will also support Digital Equipment Corporation's
new Alpha AXP RISC-based technology and the new PowerPC.

By following the PCI specification, manufacturers will be able to create
systems that require fewer computer chips. The PCI standard minimizes the
circuitry needed to connect VLSI chips. This means fewer parts and a lower
cost.

All in all, PCI offers the promise of a truly open computing manufacturing
standard with some real advantages:

* Improved performance through a local-bus architecture

* The ability to use chips to perform specialized functions (video
controllers, SCSI controllers, audio and video products, as well as LAN
adapters)

* The ability to incorporate more functionality on system boards, which
will save costs over installing expansion boards

* A common platform that will let manufacturers create expansion boards
that can accommodate any computer system - 286, 386, 486, RISC, MCA,
Alpha, etc.

In fact, PCI is expected to supplement rather than replace the bus
architecture in almost all computer systems.

Even though it is a local-bus specification, PCI is not directly
connected to the processor. It is sometimes described as a "mezzanine"
bus, and instead of having its own clock, its bus is synchronized to the
microprocessor clock and its support circuitry. The PCI 2.0 standard
includes support for clock frequencies from 20 to 33 MHz, and it can even
stop the bus clock altogether for additional power savings in green
workstations and notebook computers. As computer technology continues to
save on power consumption, PCI also provides a migration path from 5V
logic circuitry to 3.3V, with connectors provided for both power types.

And like MCA and EISA technology, PCI supports software configuration. PCI
expansion boards include 256 register bytes to store configuration
information.
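
To make the configuration registers more concrete, here is a hedged Python
sketch that decodes the first few fields of a 256-byte configuration
block. The text above gives only the size of the area; the field offsets
used here (vendor ID at 0x00, device ID at 0x02, command at 0x04, status
at 0x06, class code bytes at 0x0A and 0x0B) are assumed from the standard
PCI configuration header layout. The function name and the sample ID
values are made up for illustration and do not describe any real adapter.

```python
import struct

CONFIG_SPACE_SIZE = 256  # bytes of configuration registers per PCI function


def parse_config_header(config_space: bytes) -> dict:
    """Decode a few leading fields of a PCI configuration block.

    Offsets are assumed from the standard PCI header layout; all
    multi-byte fields are little-endian.
    """
    if len(config_space) != CONFIG_SPACE_SIZE:
        raise ValueError("expected exactly 256 bytes of configuration data")
    vendor_id, device_id, command, status = struct.unpack_from(
        "<HHHH", config_space, 0x00
    )
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "command": command,
        "status": status,
        "sub_class": config_space[0x0A],
        "base_class": config_space[0x0B],
    }


# Build a made-up configuration block (the IDs are placeholders).
example = bytearray(CONFIG_SPACE_SIZE)
struct.pack_into("<HHHH", example, 0x00, 0x1234, 0x5678, 0x0000, 0x0000)
example[0x0B] = 0x02  # base class 0x02: network controller
example[0x0A] = 0x00  # sub-class 0x00: Ethernet
header = parse_config_header(bytes(example))

if __name__ == "__main__":
    print({name: hex(value) for name, value in header.items()})
```

Configuration software would read fields like these to identify a board
and then assign its resources, with no jumpers or DIP switches involved.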

Better Performance for Network Applications

For network applications local-bus architectures in general and the PCI
specification in particular offer some real pluses in terms of data
throughput performance. According to Intel sources, the theoretical
maximum throughput of PCI is 132 MB/s in 32-bit operation, or 264 MB/s in
64-bit operation.
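
The 132 and 264 figures follow from multiplying the bus clock by the width
of the data path in bytes, which yields megabytes per second. The short
Python sketch below shows that arithmetic. The function name is
hypothetical, and the calculation assumes one transfer per clock cycle at
the 33 MHz maximum given for PCI 2.0 earlier in this paper.

```python
def peak_throughput_mbytes_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak bus throughput, assuming one full-width transfer per clock."""
    if width_bits % 8 != 0:
        raise ValueError("bus width must be a whole number of bytes")
    bytes_per_transfer = width_bits // 8
    return clock_mhz * bytes_per_transfer


if __name__ == "__main__":
    print(peak_throughput_mbytes_per_s(33, 32))  # 32-bit PCI
    print(peak_throughput_mbytes_per_s(33, 64))  # 64-bit PCI
```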

In real terms, maximum rates are set by burst-mode operations, in which a
single address cycle can be followed by multiple data cycles that access
sequential memory locations. Very few day-to-day computing operations can
take advantage of these burst-mode high-speed data transfers. For
non-burst data transfers the PCI specification requires two clock cycles,
as well as wait states for read operations, so actual throughput is about
half that of burst mode operations, say 66 MB/s for a 32-bit connection.
Still, PCI performance far exceeds the established rate of 8 MB/s set for
the ISA bus. The higher data transfer speed can be a real bonus for
high-end workstation applications and file servers. PCI largely eliminates
the bus bottleneck of getting data on and off the network.
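
The gap between burst and non-burst rates can be illustrated with a
simplified cycle count. The Python sketch below assumes one address cycle
per transaction followed by one clock per data cycle, plus optional wait
states. This is a rough model for illustration only, not a statement of
how any particular chipset behaves, and the function name is hypothetical.

```python
def effective_throughput_mbytes_per_s(
    data_cycles: int,
    clock_mhz: float = 33.0,
    width_bits: int = 32,
    wait_states: int = 0,
) -> float:
    """Estimate throughput for one transaction under a simplified model.

    Model (an assumption): 1 address cycle, then each data cycle costs
    one clock plus `wait_states` extra clocks.
    """
    if data_cycles < 1:
        raise ValueError("a transaction needs at least one data cycle")
    bytes_moved = data_cycles * (width_bits // 8)
    clocks_used = 1 + data_cycles * (1 + wait_states)
    return clock_mhz * bytes_moved / clocks_used


if __name__ == "__main__":
    # Non-burst: a single data cycle per address cycle (two clocks total).
    print(effective_throughput_mbytes_per_s(1))
    # Longer bursts amortize the address cycle and approach the peak rate.
    for length in (4, 16, 64):
        print(length, effective_throughput_mbytes_per_s(length))
```

With a single data cycle the model gives half the peak rate, matching the
non-burst figure quoted above. Adding a wait state to a read lowers the
result further.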

And PCI is technically advanced. The first PCI-based machines are now
starting to appear, but the advanced architecture that makes the system
separate from the microprocessor probably means that PCI is here to stay
for some time to come.

The PCI Ethernet Chip Design Offers an Edge

Accton Technology Corporation is starting to develop network adapters
designed specifically to take advantage of PCI. The new generation of
PCI-oriented Ethernet controllers is designed to provide near-optimal
data throughput while minimizing CPU utilization. The result: network
adapters that deliver faster throughput rates than ever before.

These new 32-bit PCI Ethernet chipsets typically support PCI clock speeds
of up to 33 MHz with no wait states, and include on-chip DMA (Direct
Memory Access). With on-chip DMA, the adapter can be programmed to
accommodate data bursts of unlimited size, up to full Ethernet bandwidth
(10 Mbps) with less than 1 percent of the bus utilized, and requiring very
little CPU utilization.
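
The "less than 1 percent" figure can be checked with simple arithmetic.
The Python sketch below assumes that utilization is measured against the
132 megabyte-per-second theoretical peak of a 32-bit, 33 MHz PCI bus, and
it ignores protocol overhead. The paper does not say how its figure was
derived, so this is one plausible reading rather than the vendor's own
calculation. The function name is hypothetical.

```python
def pci_bus_utilization_percent(
    network_mbits_per_s: float = 10.0,
    bus_peak_mbytes_per_s: float = 132.0,
) -> float:
    """Share of peak PCI bandwidth consumed by a network data stream."""
    network_mbytes_per_s = network_mbits_per_s / 8  # bits to bytes
    return 100.0 * network_mbytes_per_s / bus_peak_mbytes_per_s


if __name__ == "__main__":
    print(round(pci_bus_utilization_percent(), 3))      # one direction
    print(round(pci_bus_utilization_percent(20.0), 3))  # both directions
```

Under these assumptions, 10 Mbps of Ethernet traffic works out to roughly
0.95 percent of the bus. Traffic at 20 Mbps, as in full-duplex operation,
would roughly double that share.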

What's more, PCI Ethernet chipsets are microprocessor independent, so the
same chip can be used to support 80486, Pentium, or Alpha-based servers
and workstations.

PCI Is Perfect for Network Applications

For network applications, this means that both network servers and high-end
workstations can take advantage of reduced CPU utilization (i.e. fewer
clock cycles are required for network applications, which means more
processing power is available for other tasks). Using DMA technology and
deep FIFO in the chip itself means fewer CPU cycles are required for
network applications. Therefore the computer bus can handle more activity,
which can be particularly useful for network server applications.

The Ethernet PCI chip design also supports full-duplex operation. That
means data can be transferred in and out of a network adapter at full
Ethernet speeds. In other words, data transfer rates of 20 Mbps (10 Mbps
in and 10 Mbps out) can be achieved using PCI adapters.

And since PCI technology is software-configurable, manufacturers can create
a single PCI network adapter and offer it for virtually any network or
computer platform. Specific information relevant to the workstation or
server hardware and the network infrastructure can be programmed into the
environment via software. And as the environment evolves, the software can
be upgraded. Unlike other hardware-specific solutions, PCI offers a true
plug-and-play networking solution for virtually any environment.

As a result, PCI technology is ideally suited for network applications.
Data transfer on and off the network is faster and more efficient because
the transfer rates are closely aligned with the clock speeds of the
microprocessor. Full duplex support also makes the most of the available
network connection, delivering data transfer rates of up to 20 Mbps for
Ethernet. It's also versatile because the PCI architecture is
microprocessor independent and because the specific information needed to
configure the adapter can be provided through external software.

Accton's PCI Networking Technology

Accton is the first manufacturer to offer a PCI-based Ethernet adapter. The
EtherDuo-PCI (EN1203) Ethernet adapter has a full 32-bit data path with
bus mastering. It also uses a DMA Ethernet chip architecture with a deep
FIFO memory to minimize CPU utilization so the system's CPU can handle
increased system bus activity. The microarchitecture supports full duplex
Ethernet operation with data transfer rates up to 20 Mbps.

Since the EtherDuo-PCI uses the PCI local bus architecture, the clock speed
of the bus matches the speed of the computer's microprocessor so packet
data can be handled at rates from 50 to 100 MHz, depending on the
microprocessor. And since the chip architecture is processor-independent,
the adapter can support 486, Pentium, Alpha, and RISC-based computers.

The EtherDuo-PCI is an excellent adapter for both servers and
high-performance workstations. It includes software to support Novell's
recently announced Universal NetWare Client. Using Virtual Loadable
Modules (VLMs) in place of the previous NETx shell, the Universal NetWare
Client can support NetWare 2, 3, 4, and Personal NetWare. The Universal
NetWare Client includes a full suite of connectivity support services
accessible via Microsoft Windows, offering direct access to NetWare file
and print services from Windows File Manager, Print Manager, and Control
Panel.

The Universal NetWare Client also features Optimized Memory Management with
memory swapping, enhanced data transport services using packet burst and
Large Internet Packet (LIP) support, and a Simple Network Management
Protocol (SNMP) client.

The EtherDuo-PCI features self-configuring 10BASE-T (RJ-45) and 10BASE2
(BNC) ports. In addition to the Universal NetWare Client, the adapter
includes NDIS drivers to support Microsoft LAN Manager, IBM LAN Server,
Digital PATHWORKS, Banyan VINES, and Wollongong's Pathway Access; and
Packet Drivers to support FTP, TCP/IP, and NCSA Telnet.

Accton is currently expanding its line of PCI-based networking products to
include additional Ethernet and Token Ring connectivity solutions.

For more information:

Accton Technology Corporation
1962 Zanker Rd
San Jose, CA 95112
(408) 452-8900,  FAX: (408) 452-8988

 ============================================================
 From the  'New Product Information'  Electronic News Service
 ============================================================
 This information was processed from data provided by the
 above mentioned company. For additional details, contact 
 the company at the address or telephone number indicated.
 ============================================================
