FPGA internal structure

Logic cells

FPGAs are built from one basic “logic-cell”, duplicated hundreds or thousands of times. A logic-cell is basically a small lookup table (“LUT”), a D flip-flop and a 2-to-1 mux (to bypass the flip-flop if desired).

The LUT can implement any logic function of its inputs. It typically has a few inputs (4 in the drawing above), so for example a 3-input AND gate whose result is then OR-ed with a fourth input fits in one 4-input LUT.
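To make the lookup-table idea concrete, here is a minimal Python sketch of a 4-input LUT holding the AND-then-OR function described above. The names `make_lut4` and `lut4` and the bit ordering of the table are assumptions chosen for illustration; real devices define their own configuration format.

```python
def make_lut4(func):
    """Build a 16-bit table (one bit per input combination) for a 4-input function."""
    init = 0
    for index in range(16):
        a = index & 1
        b = (index >> 1) & 1
        c = (index >> 2) & 1
        d = (index >> 3) & 1
        if func(a, b, c, d):
            init |= 1 << index
    return init


def lut4(init, a, b, c, d):
    """Evaluate the LUT: the four inputs simply select one bit of the table."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (init >> index) & 1


# 3-input AND whose result is OR-ed with a fourth input (hypothetical example).
AND3_OR = make_lut4(lambda a, b, c, d: (a & b & c) | d)

print(hex(AND3_OR))
print(lut4(AND3_OR, 1, 1, 1, 0), lut4(AND3_OR, 1, 1, 0, 0), lut4(AND3_OR, 0, 0, 0, 1))
```

Any other function of four inputs would use the same `lut4` lookup with a different table, which is why a single LUT can stand in for an arbitrary small gate network.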

Interconnect

Each logic-cell can be connected to other logic-cells through interconnect resources (wires/muxes placed around the logic-cells). Each cell can do little, but with lots of them connected together, complex logic functions can be created.

I/O cells

The interconnect wires also go to the boundary of the device, where I/O cells are implemented and connected to the pins of the FPGA.

Dedicated routing (carry chains)

In addition to general-purpose interconnect resources, FPGAs have fast dedicated lines between neighboring logic cells. The most common type of fast dedicated line is the “carry chain”. Carry chains allow arithmetic functions (like counters and adders) to be built efficiently (low logic usage and high operating speed).

Older programmable technologies (PAL/CPLD) don't have carry chains and so are quickly limited when arithmetic operations are required.
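As a rough illustration of why the carry path matters, the sketch below models a ripple-carry adder in Python: each bit position needs the carry from the position below it, and that bit-to-bit carry is the signal a dedicated carry chain speeds up. The function names and the 8-bit default width are assumptions for the example, not a description of any vendor's carry logic.

```python
def full_adder(a, b, carry_in):
    """One bit of the adder: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_add(x, y, width=8):
    """Add two unsigned numbers bit by bit, passing the carry up the chain."""
    carry = 0
    result = 0
    for bit in range(width):
        s, carry = full_adder((x >> bit) & 1, (y >> bit) & 1, carry)
        result |= s << bit
    return result, carry  # carry is the final carry out (overflow)


print(ripple_add(100, 55))
print(ripple_add(200, 100))
```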

In addition to logic, all new FPGAs have dedicated blocks of static RAM distributed among and controlled by the logic elements.

Internal RAM operation

There are many parameters affecting RAM operation. The main parameter is the number of agents that can access the RAM simultaneously.

  • “single-port” RAMs: only one agent can read/write the RAM.
  • “dual-port” or “quad-port” RAMs: 2 or 4 agents can read/write. Great for getting data across clock domains (each agent can use a different clock).

Here's a simplified drawing of a dual-port RAM.

To figure out how many agents are available, count the number of separate address buses going to the RAM. Each agent has a dedicated address bus. Each agent also has a read and/or a write data bus.

Writing to the RAM is usually done synchronously. Reading is usually done synchronously but can sometimes be done asynchronously.
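The behaviour described above can be sketched as a small Python model of a dual-port RAM, with one write port and one read port that each have their own address and their own clock. The class name, the method names and the split into a write-only port A and a read-only port B are assumptions made for the example.

```python
class DualPortRam:
    """Behavioural model: port A writes, port B reads, each on its own clock."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.read_data = 0  # registered output of port B

    def clock_a(self, write_enable, addr, data):
        """Rising edge of port A's clock: synchronous write."""
        if write_enable:
            self.mem[addr] = data

    def clock_b(self, addr):
        """Rising edge of port B's clock: synchronous read into the output register."""
        self.read_data = self.mem[addr]
        return self.read_data

    def read_async(self, addr):
        """Asynchronous read: the data follows the address without a clock."""
        return self.mem[addr]


ram = DualPortRam(depth=16)
ram.clock_a(write_enable=True, addr=3, data=0xAB)
print(hex(ram.read_async(3)))  # visible immediately on an asynchronous read
print(hex(ram.read_data))      # port B register unchanged until port B is clocked
print(hex(ram.clock_b(3)))
```

The two address arguments, one per port, correspond to the two separate address buses you would count on a real dual-port RAM.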

Blockram vs. distributed RAM

There are two types of internal RAMs in an FPGA: blockrams and distributed RAMs. The size of the RAM needed usually determines which type is used.

The big RAM blocks are blockrams, which are located in dedicated areas in the FPGA. Each FPGA has a limited number of these, and if you don't use them, you “lose” them (they cannot be used for anything but RAM). The small RAM blocks are either in smaller blockrams (Altera does that), or in “distributed RAM” (Xilinx does that). Distributed RAM allows using the FPGA logic-cells as tiny RAMs, which provides a very flexible RAM distribution in an FPGA but isn't efficient in terms of area (a logic-cell can actually hold very few bits of RAM). Altera prefers building blockrams of different sizes around the device (more area efficient, but less flexible). Which one is better for you depends on your FPGA application.

FPGA pin assignment

FPGAs tend to have lots of pins… So to make it a little simpler, let's put them into two bins: “user pins” and “dedicated pins”.

User pins

The user pins are called “IOs”, or “I/Os”, or “user I/Os”, or “user IOs”, or “IO pins”, or … you get the idea. IO stands for “input-output”.

  • You usually have total control over user IOs. They can be programmed to be inputs, outputs, or bi-directional (i.e. with tri-statable buffers).
  • Each IO pin is connected to an “IO cell” inside the FPGA. The “IO cells” are powered by the VCCIO pins (IO power pins) - more details below.

Dedicated pins

The “dedicated pins” are hard-coded to a specific function. They fall into the following three sub-categories.

  • Power pins
  • Configuration pins: used to “download” the FPGA.
  • Dedicated input or clock pins: these are able to drive large nets inside the FPGA, suitable for clocks or signals with large fan-outs.

The power pins fall into two categories: “core voltage” and “IO voltage”.

  • The core voltage is named “VCC” for Xilinx and “VCCINT” for Altera. It is fixed (set by the model of FPGA that you are using). It is used to power the logic gates and flip-flops inside the FPGA. The voltage was 5V for older FPGA generations, and has come down with each new generation (3.3V, 2.5V, 1.8V, 1.5V, 1.2V and even lower for the latest devices).
  • The IO voltage is named “VCCO” for Xilinx and “VCCIO” for Altera. It is used to power the I/O blocks (= pins) of the FPGA. That voltage should match what the other devices connected to the FPGA expect.

An FPGA has many VCCIO pins, which may all be powered by the same voltage. But new generations of FPGAs have a concept of “user IO banks”: the IOs are split into groups, each having its own VCCIO pins. That allows using the FPGA as a voltage translator, useful for example if one part of your board works with 3.3V logic and another with 2.5V.

Using and handling clocks in an FPGA

An FPGA design is usually “synchronous”. Simply put, that means that the design is clock based and each clock rising edge allows all the D flip-flops to simultaneously take a new state.
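A small Python sketch of this “everything updates on the clock edge” behaviour, using a 4-bit shift register as a hypothetical example: the next state of every flip-flop is computed from the old state first, and only then do all of them change together.

```python
def clock_edge(state, data_in):
    """Return the new state of a 4-bit shift register after one rising clock edge."""
    # Every flip-flop samples its D input from the old state;
    # the first one takes the external input, the others take their neighbour.
    next_state = [data_in] + state[:-1]
    # All outputs then change together, which is what returning a new list models.
    return next_state


state = [0, 0, 0, 0]
history = []
for bit in [1, 0, 1, 1]:
    state = clock_edge(state, bit)
    history.append(list(state))

for step in history:
    print(step)
```

Updating the flip-flops one at a time in place would let a new value race through several stages in a single clock, which is exactly what a synchronous design rules out.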

In a synchronous design, a single clock may drive a lot of flip-flops. That can cause timing and electrical problems inside the FPGA. To get that working properly, FPGA manufacturers provide special internal wires called “global routing” or “global lines”. They allow distributing the clock signal all over the FPGA with a low skew (i.e. the clock signal appears almost simultaneously to all the flip-flops).

Most FPGA designs use at least one clock that is generated outside the FPGA and then fed to the FPGA through one pin. Just make sure you use a clock pin (only clock pins can drive global lines).

Clock domains

An FPGA can use multiple clocks (using multiple global lines and clock pins). Each clock forms a “clock domain” inside the FPGA.

Flip-flops and combinatorial logic in each clock domain

For each flip-flop inside the FPGA, its clock domain is easy to determine. Just look at the flip-flop clock input. But what about the combinatorial logic that sits in between flip-flops?

  • If there is some combinatorial logic in between “same clock domain” flip-flops, the logic is said to be part of the clock domain too.
  • If there is some combinatorial logic in between “different clock domains” flip-flops, the logic is not owned by any clock domain. But in a typical FPGA design, there is no such logic; the only paths between different clock domains are synchronizers.

Clock domain speeds

For each clock domain, the FPGA software will analyze all flop-to-flop paths and give you a report with the maximum allowed frequencies. In the general case, only the paths within each clock domain are analyzed. The synchronizer paths (between different clock domains) usually don't matter and are not analyzed.

One clock domain may work at 10MHz, while another may work at 100MHz. As long as each clock uses a global line, and you use clock speeds that are lower than the maximums reported by the software, you don't have to worry about internal timing issues: the design is guaranteed to work internally timing-wise.
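The relation between path delays and the reported maximum frequency can be sketched as follows. This is a simplified illustration, not how any particular vendor tool computes its report: it assumes each flop-to-flop path is summarized by one total delay in nanoseconds (clock-to-output, logic, routing and setup time included) and ignores clock skew. The delay values are made up.

```python
def max_frequency_mhz(path_delays_ns):
    """The slowest flop-to-flop path in a clock domain sets its maximum frequency."""
    return 1000.0 / max(path_delays_ns)


# Hypothetical path delays (ns) for one clock domain.
paths_ns = [4.0, 8.0, 5.0]
fmax = max_frequency_mhz(paths_ns)

print(fmax)
print(100.0 <= fmax)  # would a 100MHz clock meet timing in this domain?
```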

There may still be some timing issues at the FPGA input and output pins though. The software will give you a report about that. See also the next section.

Signals between clock domains

If you need to send some information across different clock domains, special considerations apply.

In the general case, if your clocks have no relationship with one another, you cannot use a signal generated from one clock domain into another as-is. Doing so would violate setup and hold flip-flop timings (in the destination clock domain), and cause metastability.

Crossing clock domains requires special techniques, like the use of synchronizers (that's simple), or FIFOs (that's more complicated). See the “Crossing clock domains” project to get some practical ideas, plus “Interfacing Two Clock Domains” and “What Is Metastability?”.
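For a single-bit signal, the usual simple synchronizer is two flip-flops in series, both clocked by the destination clock. The Python model below is only a structural sketch under that assumption: it shows the two-stage register chain and the resulting delay, but it cannot reproduce metastability itself, which is an analog effect. The class and attribute names are made up for the example.

```python
class TwoFlopSynchronizer:
    """Two flip-flops in series, both clocked by the destination clock."""

    def __init__(self):
        self.stage1 = 0  # first flop: samples the signal from the other domain
        self.stage2 = 0  # second flop: the output the destination logic may use

    def clock(self, async_in):
        """Rising edge of the destination clock."""
        self.stage2 = self.stage1  # second flop takes the old first-flop value
        self.stage1 = async_in     # first flop samples the foreign signal
        return self.stage2


sync = TwoFlopSynchronizer()
outputs = [sync.clock(v) for v in [1, 1, 1, 0, 0]]
print(outputs)
```

The destination logic only ever looks at `stage2`, so the first flip-flop gets a full destination clock period to settle before its value is used.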