Introduction

The motivation for designing this high-speed serial link is to provide the IRAM project with approximately 1 Gbit/s of bandwidth at low power and with few pins. This can be accomplished by using point-to-point links and clock recovery techniques to sample the incoming data. A point-to-point link avoids the loading capacitance of multiple receivers on a shared bus and the associated signal reflections, so data can be sent reliably at a much higher rate. If higher connectivity to a device is required, multiple serial links can be used in parallel, with coordination of the data handled at a higher level.

Serial links are advantageous when signals must be transmitted over long distances. Transmitting information serially requires fewer wires than a parallel scheme, and a separate clock wire can be avoided by using clock recovery techniques. A system in which sender and receiver have separate clocks with nearly identical frequencies is called plesiochronous. The receiver knows the approximate data rate and can use it to extract the data from the incoming signal. This is the scenario for which many of the serial links described in the literature are designed.

The scenario in which the serial link for this project will be used is different, however, and has guided many of our design decisions. This serial link will be used for low-latency interchip communication and for data transfer to disk. Communication to disk will require an interface chip that talks to the disks using the appropriate protocol (FCAL). We are currently pursuing the possibilities of this interface chip with interested industrial partners, as the design of a very complicated interface chip is not a project we wish to undertake. Assuming this chip is available, the serial link will only need to communicate with neighboring integrated circuits on the same printed circuit board. Minimizing pin count is still a desirable goal, however, so expandable, high-speed serial links fit the application well. Because communication will only occur on a single PCB, we now have a mesochronous system. That is, sender and receiver are driven from the same clock source, so their local clocks run at the same frequency and differ only in phase. Once this assumption has been made, interchip communication with clock recovery becomes a simpler problem, because the differences between the local clocks of the various ICs will be static apart from very minor phase drift. The only causes of this drift will be temperature variations and low-frequency supply noise.

There are two basic methods for recovering the clock of the incoming data: oversampling and tracking. Oversampling techniques require at least three samples per bit. The samples are compared with neighboring samples using a majority voting logic block to determine the correct data, as in [Lee95]. Tracking schemes adjust the phase of the sampling clocks so that the edge clocks align with the edge of the data eye and the data clocks align with the center of the data eye. Tracking schemes therefore require two samples per bit: one for data and one for the edges between data bits [Dal97]. Edge and data samples are compared with their neighboring samples to determine in which direction to adjust the receive clocks. The two schemes are compared in Figure 1.1.1.

Figure 1.1.1 - Comparison of Data Recovery Schemes
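
The two decision rules can be illustrated with a minimal Python sketch. It is not the logic of [Lee95] or [Dal97]. The function names are hypothetical, the oversampled stream is assumed to be already grouped so that each consecutive triple of samples belongs to one bit, and the sign convention of the tracking decision (+1 means delay the sampling clocks) is an assumption made for illustration.

```python
from typing import List


def majority_vote_3x(samples: List[int]) -> List[int]:
    """Recover one bit per three samples by majority vote.

    Assumes the samples are already aligned to bit boundaries; a real
    oversampling receiver must also locate those boundaries.
    """
    if len(samples) % 3 != 0:
        raise ValueError("expected a multiple of 3 samples")
    bits = []
    for i in range(0, len(samples), 3):
        triple = samples[i:i + 3]
        # Two or more ones out of three samples decode as a one.
        bits.append(1 if sum(triple) >= 2 else 0)
    return bits


def early_late(prev_data: int, edge: int, next_data: int) -> int:
    """Tracking decision from one edge sample and its two data neighbors.

    Returns +1 to delay the sampling clocks (they are early), -1 to
    advance them (they are late), and 0 when there is no transition.
    """
    if prev_data == next_data:
        return 0  # no transition, so no timing information
    # If the edge sample still shows the previous bit, the edge clock
    # fired before the data transition: the clocks are early.
    return 1 if edge == prev_data else -1
```

For example, majority_vote_3x([1, 1, 0, 0, 0, 1]) yields [1, 0], so a single corrupted sample per bit is tolerated, while early_late(0, 0, 1) reports that the clocks should be delayed.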

Oversampling receivers have several advantages and disadvantages relative to tracking receivers. Oversampling receivers reject high-frequency jitter because they determine whether a bit was a zero or a one by comparing each sample directly with its nearest neighbors. When a new set of samples is examined and the transitions have shifted because of jitter, the receiver samples the data just as it did before and again sorts out what the actual data was. It does not have to track jitter the way a tracking receiver does. Oversampling receivers are simpler to implement, since no phase interpolator is required and the logic used to sort out the data runs at the slow clock rate of the chip. They do, however, require a faster sampling rate, and the area required by the logic block is significant. An oversampling receiver also introduces quantization jitter, the uncertainty in the position of each detected transition.

In general, oversampling is preferred if the high-frequency jitter that the oversampling receiver rejects is greater than the quantization jitter that oversampling introduces [Dal97]. In our case, however, sender and receiver share the same system clock, so the clock itself injects little jitter. The other possible source of jitter is the DLL. As long as the DLL has enough bandwidth to track all low-frequency jitter, only a small amount of high-frequency jitter will remain. The high-frequency jitter on the data stream should therefore be small, and the tracking scheme is preferable. In addition, the design effort and power consumption of the large (but digital) oversampling logic block are avoided.
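
The comparison above can be written as a small calculation. The sketch below is only illustrative: it assumes a simple model in which the quantization jitter equals one sample spacing (the bit time divided by the number of samples per bit), a bit time of 1000 ps corresponding to roughly 1 Gbit/s, and hypothetical jitter figures. None of these numbers are measurements from this design.

```python
def quantization_jitter_ps(bit_time_ps: float, samples_per_bit: int) -> float:
    """Assumed model: a transition is located only to within one sample spacing."""
    if samples_per_bit < 1:
        raise ValueError("samples_per_bit must be at least 1")
    return bit_time_ps / samples_per_bit


def prefer_oversampling(hf_jitter_ps: float,
                        bit_time_ps: float,
                        samples_per_bit: int = 3) -> bool:
    """Apply the rule of thumb: oversample only if the high-frequency jitter
    it would reject exceeds the quantization jitter it would introduce."""
    return hf_jitter_ps > quantization_jitter_ps(bit_time_ps, samples_per_bit)


# Hypothetical operating point: 1000 ps bit time, 3x oversampling.
BIT_TIME_PS = 1000.0
print(prefer_oversampling(50.0, BIT_TIME_PS))   # small high-frequency jitter
print(prefer_oversampling(400.0, BIT_TIME_PS))  # large high-frequency jitter
```

Under these assumptions a link with little high-frequency jitter, such as the shared-clock case described here, falls on the tracking side of the comparison.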

Basic Operation

The block diagram of the transceiver we plan to implement is shown in Figure 1.1.2. The transceiver takes 8 bits of data as input each clock cycle. This data is then encoded using the IBM 8b/10b encoding scheme [Wid83]. Using the precisely delayed clocks from the DLL, this data is then serialized into a low swing differential signal and sent off chip.

Figure 1.1.2 - Block Diagram of Transceiver
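
The serialization step on the transmit side can be sketched as follows. The 8b/10b encoder itself is not reproduced; the input symbols are taken to be 10-bit words that have already been encoded. The mapping of one delayed clock phase to one bit position, the most-significant-bit-first order, and the 100 MHz clock in the usage note are assumptions for illustration, not details of this design. The one fixed fact is that 8b/10b sends 10 line bits for every 8 data bits.

```python
from typing import Iterable, List


def serialize(symbols: Iterable[int], bits_per_symbol: int = 10) -> List[int]:
    """Flatten already-encoded symbols into a serial bit stream.

    Each symbol is sent over bits_per_symbol bit times; phase k of the
    (assumed) delay line launches bit k, most significant bit first.
    """
    stream = []
    for sym in symbols:
        if not 0 <= sym < (1 << bits_per_symbol):
            raise ValueError("symbol does not fit in bits_per_symbol bits")
        for phase in range(bits_per_symbol):
            stream.append((sym >> (bits_per_symbol - 1 - phase)) & 1)
    return stream


def line_rate_bits_per_s(byte_clock_hz: int) -> int:
    """Line rate when one 8-bit byte, encoded to 10 bits, is sent per clock."""
    return byte_clock_hz * 10


def payload_rate_bits_per_s(byte_clock_hz: int) -> int:
    """Useful data rate for the same clock: 8 payload bits per cycle."""
    return byte_clock_hz * 8
```

With a hypothetical 100 MHz byte clock, line_rate_bits_per_s(100_000_000) gives 1 Gbit/s on the wire and payload_rate_bits_per_s(100_000_000) gives 800 Mbit/s of data, which shows the 25% coding overhead.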

The receiver uses a tracking loop to adjust exactly when the input data is sampled [Dal97]. A phase interpolator adjusts the sampling clocks in fine increments, roughly 60 ps apart [Sid97]. When the sampling points are aligned with the data, bit lock is acquired. Byte lock is then acquired by searching through the sampled data bits for the predefined 10-bit sequence that is being sent to the receiver. Once byte lock is acquired, data may be sent over the link.
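
The byte-lock search can be sketched in a few lines of Python. The document does not state which 10-bit sequence is used, so SYNC_PATTERN below is a placeholder: it is the 8b/10b K28.5 comma in one running disparity, chosen only for illustration. The function names are hypothetical, and the sketch assumes bit lock has already been acquired so that the input is a clean list of recovered bits.

```python
from typing import List, Optional

# Placeholder synchronization sequence (assumption, not taken from the design).
SYNC_PATTERN = [0, 0, 1, 1, 1, 1, 1, 0, 1, 0]


def find_byte_offset(bits: List[int],
                     pattern: List[int] = SYNC_PATTERN) -> Optional[int]:
    """Return the symbol boundary offset (0..len(pattern)-1), or None.

    Slides the pattern across the recovered bits one position at a time
    and reports where the first match starts, modulo the symbol length.
    """
    n = len(pattern)
    for start in range(len(bits) - n + 1):
        if bits[start:start + n] == pattern:
            return start % n
    return None


def frame(bits: List[int], offset: int, symbol_len: int = 10) -> List[List[int]]:
    """Cut the bit stream into whole symbols once the offset is known."""
    last_start = len(bits) - symbol_len
    return [bits[i:i + symbol_len] for i in range(offset, last_start + 1, symbol_len)]


# Usage: three stray bits followed by two copies of the sync sequence.
received = [1, 0, 1] + SYNC_PATTERN + SYNC_PATTERN
offset = find_byte_offset(received)
symbols = frame(received, offset) if offset is not None else []
```

A hardware implementation would more likely compare all ten alignments in parallel; the sequential search here is only meant to show the idea.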