Hardware Engineering: Verilog Cross Domain Signal Integrity in Asynchronous Designs

main sources:

all videos can be found in this playlist

https://www.youtube.com/watch?v=0LVHPRmi88c&list=PLdcY8Cf-O1Zon-8c9NDhgY2F8r_O7oeC4

the main article referenced is:

Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

Single Bit Signals

2-FF

Conventional two flip-flop synchronizer

from Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

In general, a conventional two flip-flop synchronizer is used for synchronizing a single bit level signal. As shown in Figure 1 and Figure 2 , flip flop A and B1 are operating in asynchronous clock domain. There is probability that while sampling the input B1-d by flip flop B1 in CLK_B clock domain, output B1-q may go into metastable state. But during the one clock cycle period of CLK_B clock, output B1-q may settle to some stable value. Output of flop B2 can go to metastable if B1 does not settle to stable value during one clock cycle, but probability for B2 to be metastable for a complete destination clock cycle is very close to zero.
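
A minimal Verilog sketch of the two flip-flop synchronizer described above. The module and port names (`sync_2ff`, `d_async`, `q_sync`) and the active-low asynchronous reset are assumptions made for illustration; they do not come from the EDN article.

```verilog
// Two flip-flop synchronizer for a single-bit level signal.
// All names here are illustrative; b1_q/b2_q mirror flops B1/B2 in the text.
module sync_2ff (
    input  wire clk_b,    // destination clock (CLK_B)
    input  wire rst_n,    // asynchronous reset, active low (assumed)
    input  wire d_async,  // output of flop A, launched in the CLK_A domain
    output wire q_sync    // version of d_async that is safe to use in CLK_B
);
    reg b1_q;  // first stage: may go metastable
    reg b2_q;  // second stage: samples b1_q one CLK_B period later

    always @(posedge clk_b or negedge rst_n) begin
        if (!rst_n) begin
            b1_q <= 1'b0;
            b2_q <= 1'b0;
        end else begin
            b1_q <= d_async;
            b2_q <= b1_q;
        end
    end

    assign q_sync = b2_q;
endmodule
```

The input should come straight from a source-domain flop with no combinational logic in between, and it has to stay stable for longer than one CLK_B period so that at least one CLK_B edge samples it.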

extra:

1. DFF, D-latch and SR-latch from Computer Organization and Design 5th Ed.

flip-flop: A memory element for which the output is equal to the value of the stored state inside the element and for which the internal state is changed only on a clock edge.

latch ==> "latch onto": A memory element in which the output is equal to the value of the stored state inside the element and the state is changed whenever the appropriate inputs change and the clock is asserted.

2. why do we need Setup/Hold time constraints for FF/Reg:

https://www.youtube.com/watch?v=3_GDoBc7JdU

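in short, both constraints exist because of gate delays inside the flip-flop:

Setup must be observed so that the input has propagated through the master latch, and is read correctly, before the clock edge arrives;

Hold must be observed so that the data does not change during the delay of the internal clock inverter, while the master latch is still closing.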
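
The two constraints can also be written as the usual textbook inequalities for a register-to-register path inside one clock domain. The symbol names are chosen here for illustration and clock skew is ignored: $T_{clk}$ is the clock period, $t_{cq}$ the clock-to-Q delay of the launching flop, $t_{logic}$ the combinational delay between the flops, and $t_{su}$, $t_{h}$ the setup and hold times of the capturing flop.

```latex
\begin{aligned}
T_{clk} &\ge t_{cq,\max} + t_{logic,\max} + t_{su} && \text{(setup)} \\
t_{cq,\min} + t_{logic,\min} &\ge t_{h} && \text{(hold)}
\end{aligned}
```

Across an asynchronous clock boundary the launch and capture edges have no fixed relationship, so neither inequality can be guaranteed. That is why the synchronizers in these notes are needed.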

Toggle synchronizer

Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

Toggle synchronizer is used to synchronize a pulse generating in source clock domain to destination clock domain. A pulse cannot be synchronized directly using 2 FF synchronizer. While synchronizing from fast clock domain to slow clock domain (while slow to fast is considered safe) using 2 FF synchronizer, the pulse can be skipped which can cause the loss of pulse detection & hence subsequent circuit which depends upon it, may not function properly. Diagram in Figure 3 and Figure 4 shows toggle synchronizer implementation and Timing diagram.

for detailed explanation see: https://www.youtube.com/watch?v=Fs0AJmESX3c

in short, 3 steps (drawing from the video linked)

1. convert the pulse to a level signal; in the example this is done with a mux + reg with complemented outputs, so that each pulse toggles the level (to broaden a pulse instead, feed it through a chain of regs and OR all of the regs' outputs together)

(pulse: a signal lasting 1 clock cycle; level: a signal lasting more than 1 cycle)

2. synchronize the converted level signal by a 2FF

3. use an extra reg for delay and Xor gate to recover the pulse signal

in Figure 3, B2 is the 2FF-synchronized level signal and B3 is B2 delayed by one cycle. Both are fed to the Xor, which outputs 1 whenever they differ (B2 = 1, B3 = 0 or B2 = 0, B3 = 1), yielding a one-cycle pulse in the clkB domain, and settles back to 0 as soon as B2 and B3 are equal again.
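
The three steps can be sketched in Verilog as follows. This is an illustrative sketch, not the circuit from the article or the video: the module name, port names and the active-low reset are assumptions, while `b1_q`, `b2_q`, `b3_q` mirror flops B1, B2, B3 of Figure 3.

```verilog
// Toggle synchronizer: pulse in clk_a -> pulse in clk_b.
module toggle_sync (
    input  wire clk_a,    // source clock
    input  wire clk_b,    // destination clock
    input  wire rst_n,    // asynchronous reset, active low (assumed)
    input  wire pulse_a,  // one clk_a cycle wide
    output wire pulse_b   // one clk_b cycle wide
);
    reg a_q;               // step 1: level that toggles on every input pulse
    reg b1_q, b2_q, b3_q;  // step 2: 2-FF synchronizer, step 3: delay flop

    // Step 1: mux + reg with inverted feedback (pulse -> level change)
    always @(posedge clk_a or negedge rst_n) begin
        if (!rst_n)       a_q <= 1'b0;
        else if (pulse_a) a_q <= ~a_q;
    end

    // Steps 2 and 3: synchronize the level, then delay it by one cycle
    always @(posedge clk_b or negedge rst_n) begin
        if (!rst_n) {b1_q, b2_q, b3_q} <= 3'b000;
        else        {b1_q, b2_q, b3_q} <= {a_q, b1_q, b2_q};
    end

    // Step 3: any change of the level (0->1 or 1->0) gives one output pulse
    assign pulse_b = b2_q ^ b3_q;
endmodule
```

Two input pulses that arrive before the first level change has been sampled in clk_b toggle `a_q` back to its old value and are both lost, which is the limitation the handshake variant addresses.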

Handshake based pulse synchronizer

the toggle pulse synchronizer cannot recognize back-to-back pulses (pulses only 1 clk period apart), hence we improve it by feeding back an "ack/busy" signal that stalls the next pulse.

Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

In handshake based pulse synchronizer, as shown in Figure 5 and Figure 6 , synchronization of a pulse generated into source clock domain is guaranteed into destination clock domain by providing an acknowledgement. There is one restriction in pulse synchronizer that back to back (one clock gap) pulses cannot be handled. To make sure the next generated pulse in source clock domain gets definitely transferred and synchronized in the destination clock domain, the handshake based pulse synchronizer generates a “Busy” signal by ORing A1 and A3 flip-flop outputs. Thus the logic generating the pulse shall not generate another pulse till the busy signal is asserted.

Figure 6. Timing for Handshake based pulse synchronizer

B2 is fed back into the clkA domain through A2 and then A3; the output of A3 is the "ack" signal.

a video for more detailed explanations:

https://www.youtube.com/watch?v=g4565qWOXF4&list=PLdcY8Cf-O1Zon-8c9NDhgY2F

the mux to the right converts the pulse to a level;

the mux to the left is the feedback mux, which sets the level back to 0 once the ack arrives after the initial activation;

the OR gate, together with the registered AQ3, ensures that "busy" remains asserted after the level signal has been set back to 0, until the ack has dropped as well.

this part turns the synchronized level back into a pulse; compared to the XOR in the toggle synchronizer, it asserts only when BQ2 = 1 and BQ3 = 0, i.e. on the rising edge only.
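
A hedged Verilog sketch of the handshake based pulse synchronizer, built from the description above (A1 level flop, B1/B2 synchronizer, B3 delay, A2/A3 feedback synchronizer, busy = A1 OR A3). The module and port names, the reset, and the priority of the two muxes are assumptions; the figure itself is not reproduced here.

```verilog
// Handshake based pulse synchronizer: pulse in clk_a -> pulse in clk_b,
// with a busy flag telling the source when it may send the next pulse.
module handshake_pulse_sync (
    input  wire clk_a,    // source clock
    input  wire clk_b,    // destination clock
    input  wire rst_n,    // asynchronous reset, active low (assumed)
    input  wire pulse_a,  // one clk_a cycle wide, only sent while busy_a = 0
    output wire pulse_b,  // one clk_b cycle wide
    output wire busy_a    // high while a transfer is in flight
);
    reg a1_q;              // request level (pulse converted to level)
    reg a2_q, a3_q;        // ack synchronized back into clk_a
    reg b1_q, b2_q, b3_q;  // request synchronized into clk_b, plus delay

    always @(posedge clk_a or negedge rst_n) begin
        if (!rst_n) begin
            a1_q <= 1'b0;
            a2_q <= 1'b0;
            a3_q <= 1'b0;
        end else begin
            if (pulse_a)   a1_q <= 1'b1;  // pulse-to-level mux
            else if (a3_q) a1_q <= 1'b0;  // feedback mux: clear on ack
            a2_q <= b2_q;                 // ack path, first stage
            a3_q <= a2_q;                 // ack path, second stage
        end
    end

    always @(posedge clk_b or negedge rst_n) begin
        if (!rst_n) {b1_q, b2_q, b3_q} <= 3'b000;
        else        {b1_q, b2_q, b3_q} <= {a1_q, b1_q, b2_q};
    end

    assign pulse_b = b2_q & ~b3_q;  // rising edge of the synchronized level
    assign busy_a  = a1_q | a3_q;   // covers request and ack phases
endmodule
```

Because `a3_q` rises before `a1_q` is cleared, `busy_a` stays high without a gap from the input pulse until the ack has been released.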

Bus/Multi-bit Signals

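we can of course simply use an array of 2FF/toggle synchronizers, one per bit, but that is costly and in general not safe enough: each bit may resolve in a different destination clock cycle, so the receiver can sample a bus value that was never sent. Hence we have to explore other methods.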

Quasi-Static Signal and Gray Code

from here

Quasi-static signals. Certain signals – such as reset and other configuration signals – are quasi-static. That is, they are effectively stable for long periods of time. Such crossings do not require synchronizers in the destination domain, because they are held long enough to be captured by even the slowest clock domains without the risk of metastability.

but there can still be glitches during the transitions of these signals, and since we cannot perform STA on an asynchronous crossing, great care must be taken in analysing whether the quasi-static signals are actually safe during transitions.

multi-bit signals without Gray-coded transition sequences are very likely to generate glitches and cause undefined behavior; ==> introduce an extra "valid" signal to screen out the transition periods

for more on Gray Code see: https://en.wikipedia.org/wiki/Gray_code

for why Gray Code other than reducing multi-bit to single-bit see: why FIFO design using grey code? | Forum for Electronics

The reason for changing 1 bit at a time is not just reducing time constraints, it is also to simplify error correction problems.

E.g compare increasing from '00111' -> '01000' with using Gray code '00111' -> '01111'. If any error occurs, it would be immediately seen EXCEPT when '00111' -> '00110' or using grey code '00111' -> '00110' (decreasing state). Since only 1 bit changes at a time, for Gray code, the chance for an error to happen (due to timing issues) could be considered lower. Also, an error can also be easily detected using Gray code to check if 2 bits change instead, thus making error correction designs simpler. Whether we really need to translate to binary I think depends on the design. It doesn't seem necessary to me if only the machine has to read/write data to it.
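
For reference, a small sketch of binary/Gray conversion using the reflected binary code, `gray = bin ^ (bin >> 1)`. The module and port names are hypothetical. With this code, binary 7 -> 8 (00111 -> 01000) flips four bits, while the corresponding Gray values 00100 -> 01100 differ in exactly one bit.

```verilog
// Binary <-> Gray conversion (reflected binary code), purely combinational.
module gray_codec #(
    parameter WIDTH = 5
) (
    input  wire [WIDTH-1:0] bin_in,
    output wire [WIDTH-1:0] gray_out,
    input  wire [WIDTH-1:0] gray_in,
    output reg  [WIDTH-1:0] bin_out
);
    // Binary to Gray: XOR each bit with its more significant neighbour
    assign gray_out = bin_in ^ (bin_in >> 1);

    // Gray to binary: bit i is the XOR of Gray bits i up to the MSB
    integer i;
    always @* begin
        for (i = 0; i < WIDTH; i = i + 1)
            bin_out[i] = ^(gray_in >> i);
    end
endmodule
```

In a CDC path the Gray value should be registered in the source domain before it is handed to the synchronizer, so that the XOR logic cannot glitch into the first synchronizer flop.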

(Recirculation) Mux Synchronizer

https://www.youtube.com/watch?v=jQEv5KNO6Bw

Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

in order to synchronize data, a control pulse is generated in source clock domain when data is available at source flop. Control Pulse is then synchronized using 2 flip flop synchronizer or pulse synchronizer (Toggle or Handshake) depending on clock ratio between source and destination domain. Synchronized control pulse is used to sample the data on the bus in destination domain. Data should be stable until it is sampled in destination clock domain.

==> ! if we use a 2FF for the ctrl pulse synchronization, then it is best used only for slow-to-fast crossings

Notice that in the Mux Synchronizer there is no "ack" signal fed back, so the source has no way to know whether the data was correctly received. Handshake provides such assurance.
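
A minimal sketch of the recirculation mux synchronizer, assuming the control signal is a level (`en_a`) that the source raises after `data_a` is stable and that is synchronized with a plain 2-FF. All names are illustrative, and the edge detector on the synchronized enable is one possible way to produce the sampling pulse.

```verilog
// Recirculation mux synchronizer: only the 1-bit enable is synchronized,
// the data bus is sampled once the synchronized enable arrives.
module mux_sync #(
    parameter WIDTH = 8
) (
    input  wire             clk_b,    // destination clock
    input  wire             rst_n,    // asynchronous reset, active low (assumed)
    input  wire [WIDTH-1:0] data_a,   // source data, held stable by the source
    input  wire             en_a,     // registered in the source domain
    output reg  [WIDTH-1:0] data_b,   // data captured in the clk_b domain
    output reg              valid_b   // one-cycle strobe: data_b was updated
);
    reg en_q1, en_q2, en_q3;          // 2-FF synchronizer plus delay flop
    wire load = en_q2 & ~en_q3;       // rising edge of the synchronized enable

    always @(posedge clk_b or negedge rst_n) begin
        if (!rst_n) begin
            {en_q1, en_q2, en_q3} <= 3'b000;
            data_b  <= {WIDTH{1'b0}};
            valid_b <= 1'b0;
        end else begin
            {en_q1, en_q2, en_q3} <= {en_a, en_q1, en_q2};
            if (load)
                data_b <= data_a;     // mux selects new data
                                      // otherwise data_b recirculates
            valid_b <= load;
        end
    end
endmodule
```

The data bus itself never passes through a synchronizer. It is safe to sample only because the source keeps `data_a` unchanged until well after `load` has fired.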

Handshake

read this post for some background and a breakdown of the protocol:

Handshaking in Digital Design

A state transition diagram for the handshaking protocol is shown below. This state diagram responds to two inputs: ready and done. It controls one output: start. Elsewhere in the design, there must be other state machines that control the ready and done signals.

Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

In this synchronization scheme request and acknowledge mechanism is used to guarantee the sampling of correct data into destination clock domain irrespective of clock ratio between source clock and destination clock. This technique is mainly used to synchronize vector signal which is not changing continuously or very frequently. (Since) data should remain stable on the bus until synchronized Acknowledge signal is received.

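Here is an example of a simple handshake protocol in Verilog:

module handshake (
    input      clk1,      // clock domain 1 (sender)
    input      clk2,      // clock domain 2 (receiver)
    input      reset,     // asynchronous reset, active high, shared by both domains
    input      data_in,   // input data (clk1 domain); high starts a transfer
    output reg data_out,  // output data (clk2 domain)
    output reg ready      // request flag (clk1 domain), held until acknowledged
);

    // Sender states
    localparam IDLE_STATE          = 2'b00;
    localparam DATA_VALID_STATE    = 2'b01;
    localparam DATA_TRANSFER_STATE = 2'b10;

    // Internal signals
    reg [1:0] state;
    reg       data_hold;           // data kept stable during the transfer
    reg       req_meta, req_sync;  // 2-FF synchronizer: ready -> clk2
    reg       ack_meta, ack_sync;  // 2-FF synchronizer: req_sync (ack) -> clk1

    // Sender state machine (clk1 domain)
    always @(posedge clk1 or posedge reset) begin
        if (reset) begin
            state     <= IDLE_STATE;
            ready     <= 1'b0;
            data_hold <= 1'b0;
            ack_meta  <= 1'b0;
            ack_sync  <= 1'b0;
        end else begin
            ack_meta <= req_sync;
            ack_sync <= ack_meta;
            case (state)
                IDLE_STATE: begin
                    if (data_in) begin
                        data_hold <= data_in;
                        ready     <= 1'b1;
                        state     <= DATA_VALID_STATE;
                    end
                end
                DATA_VALID_STATE: begin
                    if (ack_sync) begin
                        ready <= 1'b0;
                        state <= DATA_TRANSFER_STATE;
                    end
                end
                DATA_TRANSFER_STATE: begin
                    if (!ack_sync && !data_in)
                        state <= IDLE_STATE;
                end
                default: state <= IDLE_STATE;
            endcase
        end
    end

    // Receiver (clk2 domain)
    always @(posedge clk2 or posedge reset) begin
        if (reset) begin
            req_meta <= 1'b0;
            req_sync <= 1'b0;
            data_out <= 1'b0;
        end else begin
            req_meta <= ready;
            req_sync <= req_meta;
            if (req_sync)
                data_out <= data_hold;  // data_hold is stable while the request is high
        end
    end

endmodule

In this example, we have two clock domains `clk1` and `clk2`, a reset signal, an input data signal `data_in`, an output data signal `data_out`, and a request flag `ready`. `data_in` also serves as the "data available" trigger.

The module implements a simple four-phase request/acknowledge handshake between the sender (in clock domain 1) and the receiver (in clock domain 2). Every signal that crosses a domain goes through a 2-FF synchronizer; a clock is never sampled as data. The protocol works as follows:

1. The sender waits in `IDLE_STATE` for the input data signal `data_in` to be asserted.
2. When `data_in` is asserted, the sender latches it into `data_hold`, asserts `ready` as the request, and waits in `DATA_VALID_STATE`.
3. The receiver sees the synchronized request `req_sync` and copies `data_hold` into `data_out`. The synchronized request also serves as the acknowledge.
4. When the sender sees the synchronized acknowledge `ack_sync`, it deasserts `ready` and moves to `DATA_TRANSFER_STATE`.
5. Once the acknowledge has dropped and `data_in` is deasserted, the sender goes back to the idle state and waits for the next data transmission.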

https://www.youtube.com/watch?v=DLdzmNkSfG8

depending on the logic that generates the "ack", we can verify the integrity of the data transaction to different extents, e.g. for the simplified graph above, if the "ack" is generated by hashing the received data and the transmitter has a matching counterpart, then the "ack" can be used as a checksum of the transaction.

Another example for simply checking for transaction termination is:

in the Mux Synchronizer, feed the select signal from the receiver's side to the transmitter, as an input to the logic determining the data transmission control signal value.

Asynchronous FIFO synchronization

Recall that Handshake Sync. requires the data to stay stable during the entire transaction and more; for frequently changing data, use async. FIFO.

==> ignore all the signals, focus only on WR/RD/FIFO modules, and the domain crossing.

since wr/rd operations are separated, the only synchronization we care about is the wr/rd_ptr CDC, which is necessary to check for the full/empty status of the FIFO, which is then used to generate wr/rd_en signals for control.

==> the other "sync." that matters is the async. reset synchronization, which is discussed in the Zhihu article 异步复位，同步释放 (asynchronous reset, synchronous release).
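
A minimal sketch of the "asynchronous reset, synchronous release" idea, assuming an active-low reset; the module and signal names are illustrative and are not taken from the linked article. The reset asserts immediately, but it is released only on a clock edge of the domain that uses it.

```verilog
// Reset synchronizer: asynchronous assertion, synchronous release.
module reset_sync (
    input  wire clk,         // clock of the domain that uses the reset
    input  wire arst_n,      // raw asynchronous reset, active low
    output wire rst_n_sync   // reset that is released synchronously to clk
);
    reg r1, r2;

    always @(posedge clk or negedge arst_n) begin
        if (!arst_n) {r1, r2} <= 2'b00;     // assert at once, no clock needed
        else         {r1, r2} <= {1'b1, r1}; // release after two clock edges
    end

    assign rst_n_sync = r2;
endmodule
```

In an async. FIFO one such instance would be used per clock domain, one for the write side and one for the read side.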

Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN

FIFO is best way to synchronize continuously changing vector data between two asynchronous clock domains. Asynchronous FIFO synchronizer offers solution for transferring vector signal across clock domain without risking metastability and coherency problems.

In Asynchronous FIFO design, FIFO provides full synchronization independent of clock frequency.

Read the article below for a more detailed introduction to FIFO and Async. FIFO

Crossing clock domains with an Asynchronous FIFO

the full/empty cond. of a FIFO is checked by:

// The FIFO is empty when both read and write pointers point to the
// same location.
assign o_rempty = (wbin == rbin);

// It is full when wbin-rbin = 2^N. In that case, the bottom AW
// address bits are identical, but the top bit is different.
assign o_wfull = (wbin[AW] != rbin[AW])
              && (wbin[AW-1:0] == rbin[AW-1:0]);

==> the MSB/wrap_bit is specifically added to indicate whether a write addr. wrap around happened (when we filled the entire FIFO and set wr_ptr again, to the addr. where rd_ptr is at, which would be confusing without additional wrap_bit).

video for detailed explanation:

https://www.youtube.com/watch?v=0LVHPRmi88c

Async. FIFO and Gray Code

==> since the pointers only ever step to the next addr., the sequence of bit changes is fixed, so Gray Code is a perfect match for the wr/rd ptrs in CDC

==> one thing to keep in mind is that, when using addr. starting from 0, the FIFO depth (number of addr.) must be a power of 2, because the wrap around has to obey Gray Code as well

==> however, if we are free to set the starting addr. then the following equation can give the start/end addr. for an async. FIFO with arbitrary depth, with Gray Codes

e.g. for a FIFO depth of 520, the Gray Code conforming addr. space is
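
One commonly used construction is sketched below as an assumption; it is not necessarily the equation originally shown here. For an even depth $D$, take the smallest $N$ with $2^N \ge D$ and use the $D$ consecutive values centred in the $N$-bit range. The reflected Gray code satisfies $g(2^N - 1 - k) = g(k) \oplus 2^{N-1}$, so the first and last address differ only in the MSB and the wrap around is still a single-bit change.

```latex
\begin{aligned}
N &= \lceil \log_2 D \rceil, \qquad
a_{start} = \frac{2^N - D}{2}, \qquad
a_{end} = 2^N - 1 - a_{start} = a_{start} + D - 1 \\[4pt]
D = 520:&\quad N = 10, \quad
a_{start} = \frac{1024 - 520}{2} = 252, \quad
a_{end} = 1023 - 252 = 771
\end{aligned}
```

Under this assumption a depth of 520 uses binary addresses 252 through 771, and $771 - 252 + 1 = 520$.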

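below is an example verilog for FIFO synchronizer:

module fifo_synchronizer (
    input            clk_source,  // write clock
    input            clk_dest,    // read clock
    input            reset,       // asynchronous reset, active low
    input      [7:0] data_in,
    output reg [7:0] data_out
);

    reg [7:0] fifo [0:7];             // 8 entries of 8 bits
    reg [3:0] wr_ptr, rd_ptr;         // binary pointers, MSB is the wrap bit
    reg [3:0] wr_gray, rd_gray;       // Gray-coded pointers that cross domains
    reg [3:0] rd_gray_m, rd_gray_s;   // rd pointer synchronized into clk_source
    reg [3:0] wr_gray_m, wr_gray_s;   // wr pointer synchronized into clk_dest

    wire [3:0] wr_next = wr_ptr + 4'd1;
    wire [3:0] rd_next = rd_ptr + 4'd1;

    // Full: Gray pointers differ in the top two bits only. Empty: identical.
    wire full  = (wr_gray == {~rd_gray_s[3:2], rd_gray_s[1:0]});
    wire empty = (rd_gray == wr_gray_s);
    wire wr_en = ~full;
    wire rd_en = ~empty;

    always @(posedge clk_source, negedge reset) begin
        if (~reset) begin
            wr_ptr <= 0; wr_gray <= 0; rd_gray_m <= 0; rd_gray_s <= 0;
        end else begin
            rd_gray_m <= rd_gray;
            rd_gray_s <= rd_gray_m;
            if (wr_en) begin
                fifo[wr_ptr[2:0]] <= data_in;
                wr_ptr  <= wr_next;
                wr_gray <= wr_next ^ (wr_next >> 1);
            end
        end
    end

    always @(posedge clk_dest, negedge reset) begin
        if (~reset) begin
            rd_ptr <= 0; rd_gray <= 0; wr_gray_m <= 0; wr_gray_s <= 0; data_out <= 0;
        end else begin
            wr_gray_m <= wr_gray;
            wr_gray_s <= wr_gray_m;
            if (rd_en) begin
                data_out <= fifo[rd_ptr[2:0]];
                rd_ptr   <= rd_next;
                rd_gray  <= rd_next ^ (rd_next >> 1);
            end
        end
    end

endmodule

In this example, the fifo_synchronizer module has two clock inputs (clk_source and clk_dest), an active-low reset input (reset), an 8-bit data input (data_in), and an 8-bit data output (data_out). A single 8-entry FIFO buffer is written in the source domain and read in the destination domain. The wr_en and rd_en signals control the write and read operations, respectively, and are derived from the full and empty flags. The wr_ptr and rd_ptr signals point to the current write and read positions in the FIFO buffer and carry one extra wrap bit; only their Gray-coded copies cross the clock boundary, each through a 2-FF synchronizer. This sketch writes whenever the FIFO is not full and reads whenever it is not empty; a practical design would expose the enables and the full/empty flags as ports.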
