Path: bloom-picayune.mit.edu!snorkelwacker.mit.edu!americast.com!americast.com!americast-post
Newsgroups: americast.ieee
From: americast-post@AmeriCast.Com
Organization: American Cybercasting
Approved: americast-post@AmeriCast.com
Subject: Synchronous dynamic RAM
Date: Wed, 4 Nov 92 10:20:19 EST
Message-ID: <1049.art.1992Nov4.102019@AmeriCast.com>

Synchronous dynamic RAM

Betty Prince, Roger Norwood, Joe Hartigan, Wilbur C. Vogley
Texas Instruments Inc.

Data rates of 500 megabytes per second are expected for a dynamic RAM with a new architecture, new interfaces, and a new name: the synchronous dynamic RAM. In other words, the 100-200-MHz microprocessor of the mid-1990s could function nonstop if this kind of dynamic RAM (DRAM) were used for main memory. In addition, in simple cases, main memory might once again handle special functions like graphics, which currently need specialty DRAMs to achieve high enough data rates. Yet the synchronous DRAM is an evolutionary approach to high-speed memory, in that it ties together, in a single device, many of the diverse techniques developed over the years to augment memory transfers. The plan is for it to become a multisourced, commodity chip.

SPEED MISMATCH. Standard DRAMs and the new microprocessors have increasingly diverged in speed. Basic DRAM architecture has not changed since the early 1970s, whereas design and architectural innovations have improved microprocessors to the point where speeds of 500 MHz or more look possible by the end of the decade. DRAMs, on the other hand, currently take about 60 ns to access a row address, or 30 ns from address to data available in page mode, which is roughly a 25-MHz cycle rate. The problem is compounded by the increasing density of dynamic RAMs, which further adds to the time needed to access the total memory through a memory bus of a given width.

CHIP SOLUTIONS. Still, as DRAM chips have become larger, chip-level solutions have begun to appear. The cache, for example, can be included on the DRAM. Fast-access modes such as page mode, static column mode, or nibble mode have helped to nearly double basic DRAM speed. Page and static column modes work by keeping the active row data latched in the sense amplifiers, and merely selecting new random column addresses. Nibble mode uses wider internal architectures (by 4) and fast registers to make a 4-bit parallel-to-serial conversion. The result is fast access to 4 consecutive bits after a random address. Although these modes can roughly double the DRAM's bandwidth over normal (non-mode) operation, they still fall far short of today's processor needs. Wider external data buses (8 or 16 bits wide, for example) can also allow more information to be retrieved from the DRAM on a single access. The penalties here are larger chips and packages and greater DRAM output noise, all of which tend to slow down the read access.

NONPROPRIETARY. In the synchronous DRAM, many manufacturers, including Texas Instruments, Samsung, Hitachi, Toshiba, Mitsubishi, and NEC, are addressing these speed concerns in an evolutionary, nonproprietary approach. Initially, they are developing synchronous DRAMs of 16M-bit capacity. These are currently being considered for open standardization in the Electronic Industries Association/Joint Electron Devices Engineering Council (EIA/Jedec) JC42.3 DRAM Standards Committee. (While synchronous static RAMs have been around for some years, the idea of using a dynamic RAM synchronously has only recently gained acceptance.)
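To put the mismatch in perspective, the back-of-the-envelope sketch below (in C) converts the cycle times quoted above into peak data rates. The byte-wide and 16-bit-wide assumptions and the best-case page-hit assumption are purely illustrative; sustained rates in a real system are lower.

    /* Back-of-the-envelope peak-bandwidth sketch using the figures quoted
     * above.  Assumes every access hits the open page (best case); the
     * byte-wide and 16-bit-wide cases are illustrative assumptions. */
    #include <stdio.h>

    static double mbytes_per_sec(double cycle_ns, double bytes_per_cycle)
    {
        /* bytes per cycle / (cycle_ns * 1e-9 s), expressed in MB/s */
        return bytes_per_cycle * 1000.0 / cycle_ns;
    }

    int main(void)
    {
        printf("page mode, x8,  30 ns/cycle: %6.1f MB/s\n", mbytes_per_sec(30.0, 1.0));
        printf("page mode, x16, 30 ns/cycle: %6.1f MB/s\n", mbytes_per_sec(30.0, 2.0));
        printf("synchronous DRAM target    : %6.1f MB/s\n", 500.0);
        return 0;
    }

Even the wider page-mode case lands near 67 MB/s, an order of magnitude short of the 500-MB/s goal cited above.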
Historically, DRAMs have been controlled asynchronously: the processor presents addresses and control levels to the memory, indicating that a set of data at a particular location in memory should be either read from or written into the DRAM (for a write, the processor also presents the data it wants written). After a delay--the access time--the RAM either writes the new information from the processor into its memory or else provides the information on its outputs for the processor to read. During the access-time delay, the DRAM performs various internal functions, such as activating the high capacitance of the word lines and bit lines, sensing the data, and routing the data out through the output buffers. The delay creates a wait state for the processor; it simply waits for the RAM's response, and the entire system slows down as a result.

Under synchronous control, on the other hand, the DRAM latches information in and out under the control of the system clock. The processor drops off its instructions for the DRAM in a set of latches, which hold the addresses, data, and control signals on the DRAM inputs until the memory can process the request. The DRAM responds after a set number of clock cycles, which can be programmed by the user in a special configuration cycle.

NO WAITING. Since the processor knows how many clock cycles it takes for the DRAM to respond, it can safely go off and do other tasks while the RAM is processing its requests. An example is a DRAM that has a 60-ns read delay after initial addressing and is operated with a 10-ns (100-MHz) clock. If the RAM is asynchronous, the processor waits the full 60-ns access time for the information. But if the DRAM is synchronous, the processor can strobe the addresses into a set of input latches and do other tasks while the memory does the read operation. Then, when the processor clocks the outputs of the RAM six cycles (60 ns) later, the data it wants is there.

In the synchronous DRAM timing diagram, the row address is strobed in on the rising edge of the system clock, activating a row, or word line, of memory bits. A column address is then clocked in after three clock cycles (30 ns), sufficient time to activate the word line. The byte of data then appears on the outputs after three more cycles (another 30 ns), enough time to decode the new address and get the data from the sense amplifiers through the output buffers. Six clock cycles after the row address has been clocked in, the processor can expect the requested information from the output buffers of the synchronous DRAM.

The architecture of the synchronous DRAM can further shorten its average access time by pipelining addresses. The input latch stores the next address the processor will want while the RAM is operating on the previous address. Normally the addresses to be accessed are known several cycles in advance by the processor; therefore, after the address for the first access has been sent and the RAM has started working on it, the processor can send the following address to the input latch, so that it is available as soon as the first address has moved on to the next stage of processing. The processor need not wait a full access cycle before starting its next access to the DRAM. Various techniques can also be used inside the synchronous DRAM to hide the components of the internal timing delay.
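The clock-cycle arithmetic above can be made concrete with a short sketch. The C fragment below uses the figures from the example (10-ns clock, column address three cycles after the row address, data three cycles after the column address) and assumes, purely for illustration, that a new column address can be pipelined in on every clock once the row is active; no particular vendor's part is implied.

    /* Idealized timing sketch of the clocked access described above:
     * 10-ns clock, row-to-column spacing of 3 cycles, column-to-data
     * latency of 3 cycles.  Accepting a new column address every clock
     * is an assumption made for illustration. */
    #include <stdio.h>

    enum { CLOCK_NS = 10, ROW_TO_COL = 3, COL_TO_DATA = 3 };

    int main(void)
    {
        int n_reads = 4;                             /* pipelined reads on one open row */
        for (int i = 0; i < n_reads; i++) {
            int col_cycle  = ROW_TO_COL + i;         /* column address i strobed here   */
            int data_cycle = col_cycle + COL_TO_DATA;/* data for address i appears here */
            printf("read %d: column at cycle %d, data at cycle %d (%d ns after row)\n",
                   i, col_cycle, data_cycle, data_cycle * CLOCK_NS);
        }
        return 0;
    }

The first word arrives six cycles (60 ns) after the row address, matching the example; under the pipelining assumption, each following word arrives only one clock later, rather than a full access time later.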
The address setup time and word and bit line precharge time can be eliminated after the first access by using a burst mode, in which a series of data bits can be clocked out rapidly after the first bit has been accessed. The burst mode in the synchronous DRAM is similar to the old nibble mode, in which 4 bits of sequential data are provided in rapid succession without inputting new address information to the DRAM--but now as much as a full page of data can be provided [Fig. 1]. Burst mode is only useful if all the bits to be accessed are in sequence and in physically the same row of cells as the initial access. Likely applications include high-speed memory functions such as video support requiring 100-MHz data rates.

Burst mode can be combined with a "wrap" feature. The wrap gives rapid access, for example, to strings of bits stored both before and after the initial bit location. This feature is useful for cache filling, since the bits most likely to be wanted next are those physically close to the current bit. The user can program both the type of wrap and the number of bytes available on each wrap.

If data from different rows is needed, it is still possible to hide some of the row precharge time if the two rows lie in different banks. With the synchronous DRAM, the multiple-bank interleaving previously done at the system level may be moved onto the chip. Then one bank can be precharged while another is being accessed. An example is the "nibbled-page" architecture of Tokyo's Toshiba Corp., in which the data from 8-bit sections of different columns is interleaved on-chip to give byte-level random access at a 100-Mb/s rate.

Another advantage of multiple-bank synchronous DRAMs is that the active rows (potentially one in each bank) may serve as a cache. In more detail, once a given row in a given bank is accessed, it is held active and may be accessed again simply by supplying a new column address. This method has been used in page-mode devices, but had only limited success because only one row in an asynchronous DRAM could be held active.

ADDING ASSETS. All these features--burst and wrap modes and interleaved banks--can be combined on a single synchronous DRAM that runs at up to 400-500 MHz. And still more features may be added. For example, a data mask control can be used as an enable/disable pin to ignore inputs or turn the outputs off for a single clock cycle. This could be useful, especially in write cycles, if the user wants to access a string of bits, but not all the bits in the string. Another feature, clock enable/disable, turns off the system clock inside the RAM, thereby suspending the device in its current state, or puts it into low-power standby mode, saving energy in battery-powered equipment.

Synchronous DRAMs need a refresh cycle, since they are still composed of dynamic cells that lose their charge. But another option, self-refresh, appears on many of the newer byte-wide DRAMs. This new, simpler refresh mode is completely controlled on the chip and retains data in a low-power mode. The self-refresh option is expected to be available on synchronous DRAMs from many vendors.
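To make the wrap feature concrete, the sketch below shows one plausible wrap ordering: column addresses advance sequentially but stay inside an aligned block and wrap around its end. The block length of 8, the sequential ordering, and the function names are illustrative assumptions; as noted above, the actual wrap type and length on a synchronous DRAM are programmable and were still being standardized when this was written.

    /* Illustrative sketch of a "wrap" burst: starting from an arbitrary
     * column address, successive addresses stay inside an aligned block
     * of wrap_len columns and wrap around its end.  One plausible
     * (sequential) scheme, not any standard's definition. */
    #include <stdio.h>

    /* wrap_len must be a power of two (e.g. 4 or 8) */
    static unsigned wrap_addr(unsigned start_col, unsigned wrap_len, unsigned i)
    {
        unsigned base   = start_col & ~(wrap_len - 1);      /* aligned block       */
        unsigned offset = (start_col + i) & (wrap_len - 1); /* wraps within block  */
        return base | offset;
    }

    int main(void)
    {
        unsigned start = 0x36, len = 8;
        printf("wrap of %u starting at column 0x%02X:", len, start);
        for (unsigned i = 0; i < len; i++)
            printf(" 0x%02X", wrap_addr(start, len, i));
        printf("\n");   /* 0x36 0x37 0x30 0x31 0x32 0x33 0x34 0x35 */
        return 0;
    }

A cache controller filling a line would thus get the critical word first and the neighboring words, before and after it, in the following clocks.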
The final speed attained by this kind of DRAM in a system depends not only on its internal architecture but also on the interface signal levels it uses. A 5- or 3.3-V interface can function smoothly in, say, a desktop computer running at 50 MHz. If, however, these same interfaces, with their relatively large 2-V output swings (0.4 V low to 2.4 V high), run in a 125-MHz system, unterminated lines longer than a few centimeters could cause delays. Therefore, higher-speed synchronous DRAMs will have low-swing interfaces such as Gunning transceiver logic (GTL) or center-tap-terminated (CTT) ["Fast interfaces for DRAMs," pp. 54-57] to compensate for transmission-line effects.

SHORTER LEADS. The high speed of synchronous DRAMs influences their packaging. Most appropriate for them are new miniature packages such as the thin small-outline package (TSOP) or various vertical surface-mount packages. These reduce the effective length of the package wiring and leads, and devices may be tightly spaced on a circuit board.

Advances in fast DRAM architecture, high-speed interfaces, and miniature packaging combine in the synchronous DRAM into a widely sourced device type that could become the next-generation commodity DRAM. Whether the promise is fulfilled depends on several factors: producers must standardize their products, and they must produce them in the high volume needed to bring costs down. No less important, many mainstream DRAM manufacturers must offer synchronous DRAMs so that users may have alternative sources to rely on.

Copyright 1992, IEEE Spectrum. For more information, send e-mail to American Cybercasting Corporation (usa@AmeriCast.COM)