Multi-port Memory Controller
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2153 Location: Canada
|
Does anyone by chance have code for a multi-port memory controller? I need about six ports to memory, and I'm looking for sample code. I've spent part of a day trying to come up with one that interfaces to a DDR2 controller. The DDR2 controller reads and writes 128-bit blocks to a 16-bit-wide DDR2 RAM. I don't know whether to cache the data in the controller or just provide the raw 128 bits. There are a number of bus masters in the system with different requirements:
ch#0: Bitmap controller: read-only, 16 bit units
ch#1: CPU: read-write, 32 bits
ch#2: Ethernet controller: read-write, 32 bits
ch#3: Graphics accelerator: read-write, 16 bits
ch#4: Audio DMA: read-write, 12 bit units
ch#5: Sprite DMA: read-only, 16 bit units
_________________Robert Finch http://www.finitron.ca
|
Fri Jan 30, 2015 9:04 am |
|
|
MichaelM
Joined: Wed Apr 24, 2013 9:40 pm Posts: 213 Location: Huntsville, AL
|
Rob:

I can't provide a full solution for you, but I've implemented this several times over the past few years. My solution has been to use a one-hot arbiter, a Command/Address FIFO, and an output data FIFO. The port that is granted the memory places the RnW command and the address in the Command/Address FIFO, and the memory interface performs the operation specified. If the operation is a memory write, the winning port also writes to the output data FIFO. If it is a memory read, the memory interface returns the data along with a data valid strobe on a common input bus, and the winning port captures the data.

Providing overlapped I/O for memory writes is fairly easy because the transactions are buffered in the Command/Address and Data Output FIFOs. Thus, after the winning port has written the transaction to these FIFOs, it can release its request to the arbiter.

Reads are a bit more complicated if overlapped I/O is needed. When I've needed this behavior, I create a delay line and use separate read data valid signals for each port. Otherwise, I simply require the port needing a memory read operation to hold its request until its requested transaction is completed. This approach doesn't provide the most efficient use of the memory bandwidth, but it reduces the complexity of the read data valid signal.

I've in-lined a synthesizable simulation module for a Xilinx DDR2 SDRAM MIG interface, and the arbiter implementation that I've used successfully on several projects. Perhaps the attached modules will provide you a starting point for your application. (The synthesizable DDR2 SDRAM module, which uses internal BRAMs, was included in the final product as a test structure. It emulates the general behavior of Xilinx MIG, but is definitely not an SDRAM controller.)

Code:
`timescale 1ns / 1ps
///////////////////////////////////////////////////////////////////////////////
// Company:         Alpha Beta Technologies, Inc.
// Engineer:        Michael A. Morris
//
// Create Date:     11:05:11 10/14/2008
// Design Name:     CFD DQAM FPGA
// Module Name:     DDR2_Grnt_Ctrl.v
// Project Name:    CFD DQAM FPGA
// Target Devices:  XC5VLX110T-1FF1136
// Tool versions:   ISE10.1i
//
// Description:
//
//  This module is an access controller for the four client interfaces that
//  supports the DRAM_Ctrl module. A state machine implementation is used to
//  provide the grant signal outputs. In the Idle state, there are no pending
//  DDR2 access requests. If an access request is received in the Idle state,
//  the SM goes to the grant state corresponding to the highest priority
//  request received.
//
//  The SM stays in the grant state of the highest priority request until that
//  request is deasserted. Since no combinational logic is used in the SM state
//  register output, the requesting interface must wait for its grant to be
//  deasserted after deasserting its request signal before reasserting the
//  request and starting a new cycle.
//
//  A nested if-else-if structure is used for establishing the priority of
//  the requests. Thus, although the request signal array appears to use a
//  parameterized priority, the fixed evaluation sequence in the Idle state
//  means the priorities are actually fixed. A different structure will be
//  required if a parameterizable access controller is required in the future.
//
// Dependencies:    None
//
// Revision:
//
//  0.00    08J14   MAM     Creation Date
//
//  1.00    09C09   MAM     Modified to include driver for the internal DDR2
//                          bus which drives the DDR2 controller bus to an idle
//                          condition when no peripheral has been granted the
//                          bus. Provides known states for all bus signals. Was
//                          imported from VHDL version, which had been modified
//                          to prevent unknown bus states during behavioral
//                          simulation.
//
// Additional Comments:
//
///////////////////////////////////////////////////////////////////////////////
module DDR2_Grnt_Ctrl(
    input   Rst,
    input   Clk,

    input   [3:0] Rqst,
    output  reg [3:0] Grnt,

    output  DDR2_Select,
    output  DDR2_AF_WE,
    output  [ 24:0] DDR2_AF,
    output  [  2:0] DDR2_Cmd,
    output  DDR2_DO_WE,
    output  [127:0] DDR2_DO
);

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  Module Parameter Declarations
//

localparam pVME_Priority  = 3;
localparam pDQMA_Priority = 2;
localparam pDQMB_Priority = 1;
localparam pPCIe_Priority = 0;

localparam pNull_Grnt     = 0;

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  Module Port Definitions
//

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  Module Signal Declarations
//

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  Module Implementation
//

//  Grant state machine: from idle, grant the highest priority request; then
//  hold the grant until the owner deasserts its request and return to idle.

always @(posedge Clk)
begin
    if(Rst)
        Grnt <= #1 pNull_Grnt;
    else
        case(Grnt)
            pNull_Grnt :
                if(Rqst[pVME_Priority])
                    Grnt <= #1 (1 << pVME_Priority);
                else if(Rqst[pDQMA_Priority])
                    Grnt <= #1 (1 << pDQMA_Priority);
                else if(Rqst[pDQMB_Priority])
                    Grnt <= #1 (1 << pDQMB_Priority);
                else if(Rqst[pPCIe_Priority])
                    Grnt <= #1 (1 << pPCIe_Priority);
                else
                    Grnt <= #1 pNull_Grnt;

            (1 << pVME_Priority) :
                if(Rqst[pVME_Priority])
                    Grnt <= #1 (1 << pVME_Priority);
                else
                    Grnt <= #1 pNull_Grnt;

            (1 << pDQMA_Priority) :
                if(Rqst[pDQMA_Priority])
                    Grnt <= #1 (1 << pDQMA_Priority);
                else
                    Grnt <= #1 pNull_Grnt;

            (1 << pDQMB_Priority) :
                if(Rqst[pDQMB_Priority])
                    Grnt <= #1 (1 << pDQMB_Priority);
                else
                    Grnt <= #1 pNull_Grnt;

            (1 << pPCIe_Priority) :
                if(Rqst[pPCIe_Priority])
                    Grnt <= #1 (1 << pPCIe_Priority);
                else
                    Grnt <= #1 pNull_Grnt;

            default : Grnt <= #1 pNull_Grnt;
        endcase
end

//
//  Drive the DDR2 bus to idle state when no peripheral
//      granted the bus

assign DDR2_Select = ((~|Grnt) ? 1'b0   : 1'bZ);
assign DDR2_AF_WE  = ((~|Grnt) ? 1'b0   : 1'bZ);
assign DDR2_AF     = ((~|Grnt) ? 25'b0  : 25'bZ);   // 25 bits wide: DDR2_AF[24:0]
assign DDR2_Cmd    = ((~|Grnt) ? 3'b0   : 3'bZ);
assign DDR2_DO_WE  = ((~|Grnt) ? 1'b0   : 1'bZ);
assign DDR2_DO     = ((~|Grnt) ? 128'b0 : 128'bZ);

endmodule
Code:
`timescale 1ns / 1ps
///////////////////////////////////////////////////////////////////////////////
// Company:         Alpha Beta Technologies, Inc.
// Engineer:        Michael A. Morris
//
// Create Date:     15:45:54 10/30/2008
// Design Name:     CFD DQAM FPGA
// Module Name:     DDR2_IF_Sim
// Project Name:    DQAM_DSP
// Target Devices:  XC5VLX110T-1FF1136
// Tool versions:   ISE10.1i
//
// Description:
//
//  This module implements a model of the DDR2 SDRAM interface with Block RAM.
//  The objective is to emulate the basic interface and timing of the command
//  I/F and the return (read) data interface. The module implements this
//  emulation using a simple state machine and some timers which provide
//  response delays with respect to the address and write data, and the read
//  data path. There is no attempt to simulate delays which naturally occur in
//  the DDR2 SDRAM when new rows and banks are accessed. Neither is there any
//  attempt to simulate timing variations due to DRAM refresh cycles.
//
//  FIFOs are used on the Address (AF, Cmd) and Write Data (DO) signal paths,
//  but the return (read) data and data valid are not buffered through FIFOs.
//  Instead, they are single buffered in registers, and must be saved by the
//  user logic immediately or they are lost. This reflects the operational
//  characteristics of the Xilinx DDR2 SDRAM controller.
//
//  The address interface matches that of DDR2 SDRAM Controller. The full 25
//  bit address is used on the interface, although only the bottom addresses
//  and the two most significant addresses, the BA[1:0] address bits, are used.
//  Thus, this module presents the same address interface to the VME as does
//  the DDR2 SDRAM module, but allows the addresses to roll around the actual
//  size of the Block RAM array.
//
//  The base implementation is for a single Block RAM of 4 16kB blocks. The
//  actual organization matches the 128-bit data width of the DDR2 SDRAM
//  interface. Thus, the Block RAM is actually organized as 4x1024x128.
//
//  An initial block is included, but commented out, to initialize the contents
//  of the block RAM.
//
// Dependencies:    DPSFmnCE.v
//
// Revision:
//
//  0.00    08J30   MAM     Creation Date
//
//  1.00    09E08   MAM     Added header comments
//
// Additional Comments:
//
///////////////////////////////////////////////////////////////////////////////
module DDR2_IF_Sim(
    Rst,
    Clk,

    DDR2_Phy_Init_Done,     //c0_phy_init_done(),

    DDR2_AF_WE,             //c0_app_af_wren(),
    DDR2_AF_FF,             //c0_app_af_afull(),
    DDR2_AF,                //c0_app_af_addr(),
    DDR2_Cmd,               //c0_app_af_cmd(),

    DDR2_DO_WE,             //c0_app_wdf_wren(),
    DDR2_DO_FF,             //c0_app_wdf_afull(),
    DDR2_DO,                //c0_app_wdf_data(),

    DDR2_DI_Valid,          //c0_rd_data_valid(),
    DDR2_DI,                //c0_rd_data_fifo_out(),

    // Debug Outputs/Internal Test Points

    SM,
    C,
    Read_Cmd,
    TC_CmdDlyTmr,

    DDR2_CmdAF_DI,
    DDR2_CmdAF_RE,
    DDR2_CmdAF_DO,

    RAM_WE,
    RAMAddr,
    RAM_DI,
    RAM_DO
);

///////////////////////////////////////////////////////////////////////////////
//
//  Module Parameters
//

parameter pDQAM_Width    = 32;
parameter pDDR2AddrWidth = 25;
parameter pDDR2CmdWidth  = 3;
parameter pDDR2MskWidth  = 8;
parameter pMemAddrWidth  = 12;
parameter pDDR2SysDelay  = 18;
parameter pInputFIFOAddr = 7;

//  Derived parameters

localparam pDDR2DataWidth = 4*pDQAM_Width;
localparam pCmdMSB        = (pDDR2CmdWidth + pMemAddrWidth - 1);
localparam pCmdLSB        = (pDDR2AddrWidth);

//  Local State Declarations

localparam pIdle         = 3'b000;
localparam pLd_CmdDlyTmr = 3'b001;
localparam pWt_CmdDlyTmr = 3'b011;
localparam pRd_RAM0      = 3'b010;
localparam pRd_RAM1      = 3'b110;
localparam pWr_RAM0      = 3'b111;
localparam pWr_RAM1      = 3'b101;
localparam pDelay        = 3'b100;

///////////////////////////////////////////////////////////////////////////////
//
//  Module Port Declarations
//
//  Vector ports carry the same ranges as their net/reg declarations in the
//  Module Signal Declarations section, as the Verilog standard requires.

input   Rst;
input   Clk;

output  DDR2_Phy_Init_Done;

input   DDR2_AF_WE;
output  DDR2_AF_FF;
input   [(pDDR2AddrWidth - 1):0] DDR2_AF;
input   [ (pDDR2CmdWidth - 1):0] DDR2_Cmd;

input   DDR2_DO_WE;
output  DDR2_DO_FF;
input   [(pDDR2DataWidth - 1):0] DDR2_DO;

output  DDR2_DI_Valid;
output  [(pDDR2DataWidth - 1):0] DDR2_DI;

// Debug Outputs/Internal Test Points

output  [2:0] SM;
output  [2:0] C;
output  Read_Cmd;
output  TC_CmdDlyTmr;

output  [(pDDR2AddrWidth + pDDR2CmdWidth - 1):0] DDR2_CmdAF_DI;
output  DDR2_CmdAF_RE;
output  [(pDDR2AddrWidth + pDDR2CmdWidth - 1):0] DDR2_CmdAF_DO;

output  RAM_WE;
output  [ (pMemAddrWidth - 1):0] RAMAddr;
output  [(pDDR2DataWidth - 1):0] RAM_DI;
output  [(pDDR2DataWidth - 1):0] RAM_DO;
///////////////////////////////////////////////////////////////////////////////
//
//  Module Signal Declarations
//

reg     DDR2_Phy_Init_Done;

wire    DDR2_AF_FF;
wire    [(pDDR2AddrWidth - 1):0] DDR2_AF;
wire    [ (pDDR2CmdWidth - 1):0] DDR2_Cmd;

wire    DDR2_DO_FF;
wire    [(pDDR2DataWidth - 1):0] DDR2_DO;

reg     DDR2_DI_Valid;
reg     [(pDDR2DataWidth - 1):0] DDR2_DI;

///////////////////////////////////////////////////////////////////////////////
//
//  Signal Declarations
//

reg     [(pDDR2DataWidth - 1):0] RAM [((2**pMemAddrWidth) - 1):0];
wire    [(pDDR2DataWidth - 1):0] RAM_DI;
reg     [(pDDR2DataWidth - 1):0] RAM_DO;

wire    [(pDDR2AddrWidth + pDDR2CmdWidth - 1):0] DDR2_CmdAF_DI, DDR2_CmdAF_DO;
wire    DDR2_CmdAF_EF;

wire    M;
wire    [2:0] C;

//  Explicit declarations for nets that are otherwise created implicitly by
//  the continuous assignments further down.

wire    Ld_CmdDlyTmr, CE_CmdDlyTmr;
reg     [4:0] CmdDlyTmr;
reg     TC_CmdDlyTmr;

wire    Inc_RAMAddr;
reg     [(pMemAddrWidth - 1):0] RAMAddr;

reg     [9:0] Phy_Done_Tmr;

wire    Rst_SM;
(* FSM_ENCODING="SEQUENTIAL", SAFE_IMPLEMENTATION="NO" *)
reg     [2:0] SM = pIdle;

reg     RAM_RE;
///////////////////////////////////////////////////////////////////////////////
//
//  Module Implementation
//

//  DDR2 Phy Init Done Simulation

always @(posedge Clk)
begin
    if(Rst)
        Phy_Done_Tmr <= #1 0;
    else if(~DDR2_Phy_Init_Done)
        Phy_Done_Tmr <= #1 (Phy_Done_Tmr + 1);
end

always @(posedge Clk)
begin
    if(Rst)
        DDR2_Phy_Init_Done <= #1 0;
    else if(~DDR2_Phy_Init_Done)
        DDR2_Phy_Init_Done <= #1 &Phy_Done_Tmr;
end

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  DDR2 Controller Address and Command FIFO
//
//  Generate Command/Address FIFO Read Strobe

assign DDR2_CmdAF_RE = (  ((SM == pWt_CmdDlyTmr) & TC_CmdDlyTmr)
                        | ((SM == pRd_RAM1) & ~DDR2_CmdAF_EF &  Read_Cmd)
                        | ((SM == pWr_RAM1) & ~DDR2_CmdAF_EF & ~Read_Cmd));

//  Aggregate input data

assign DDR2_CmdAF_DI = {DDR2_Cmd, DDR2_AF};

DPSFnmCE    #(
                .addr(pInputFIFOAddr),
                .width(pDDR2AddrWidth + pDDR2CmdWidth)
            ) FIFO1 (
                .Rst(Rst),
                .Clk(Clk),

                .WE(DDR2_AF_WE),
                .FF(DDR2_AF_FF),
                .DI(DDR2_CmdAF_DI),

                .RE(DDR2_CmdAF_RE),
                .EF(DDR2_CmdAF_EF),
                .DO(DDR2_CmdAF_DO),

                .HF(),
                .Cnt()
            );

//  Generate dummy equations for M(SB) & C(md) to preserve the inputs

assign M = &DDR2_CmdAF_DO[(pDDR2AddrWidth+pDDR2CmdWidth-1):(pMemAddrWidth+1)];
assign C = {M, DDR2_CmdAF_DO[0], DDR2_CmdAF_DO[pCmdLSB]}; // Wr - 000, Rd - 001

assign Read_Cmd = (C[2:0] == 3'b001);

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  DDR2 Controller Data Output FIFO
//

DPSFnmCE    #(
                .addr(pInputFIFOAddr),
                .width(pDDR2DataWidth)
            ) FIFO2 (
                .Rst(Rst),
                .Clk(Clk),

                .WE(DDR2_DO_WE),
                .FF(DDR2_DO_FF),
                .DI(DDR2_DO),

                .RE(RAM_WE),
                .EF(),
                .DO(RAM_DI),

                .HF(),
                .Cnt()
            );

///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  Generate Simulated DDR2 Controller Delay
//

assign Ld_CmdDlyTmr = (SM == pLd_CmdDlyTmr);
assign CE_CmdDlyTmr = ~TC_CmdDlyTmr;

always @(posedge Clk)
begin
    if(Rst)
        CmdDlyTmr <= #1 0;
    else if(Ld_CmdDlyTmr)
        CmdDlyTmr <= #1 (pDDR2SysDelay - 1);
    else if(CE_CmdDlyTmr)
        CmdDlyTmr <= #1 CmdDlyTmr - 1;
end

always @(posedge Clk)
begin
    if(Rst)
        TC_CmdDlyTmr <= #1 1;
    else if(Ld_CmdDlyTmr)
        TC_CmdDlyTmr <= #1 0;
    else if(~TC_CmdDlyTmr)
        TC_CmdDlyTmr <= #1 ~|CmdDlyTmr;    //(CmdDlyTmr == 0)
end
///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  DDR2 Simulation RAM (by default parameters: 4kx128 => 64kB)
//
//  Simulated DDR2 RAM Address Register/Incrementer

assign Inc_RAMAddr = ((SM == pRd_RAM0) | (SM == pWr_RAM0));

always @(posedge Clk)
begin
    if(Rst)
        RAMAddr <= #1 0;
    else if(DDR2_CmdAF_RE)
        RAMAddr <= #1 {DDR2_CmdAF_DO[(pDDR2AddrWidth - 1)],     // BA[1]
                       DDR2_CmdAF_DO[(pDDR2AddrWidth - 2)],     // BA[0]
                       DDR2_CmdAF_DO[(pMemAddrWidth - 2):1]};   // Linear
    else if(Inc_RAMAddr)
        RAMAddr <= #1 (RAMAddr + 1);
end

initial
    $readmemh("DDR2_RAM_1.coe", RAM, 0, (2**pMemAddrWidth - 1));

assign RAM_WE = ((SM == pWr_RAM0) | (SM == pWr_RAM1));

//  Write First Mode Single-Port Block RAM

always @(posedge Clk)
begin
    if(RAM_WE) begin
        RAM[RAMAddr] <= #1 RAM_DI;
        RAM_DO       <= #1 RAM_DI;
    end else
        RAM_DO <= #1 RAM[RAMAddr];
end

//  Registered data outputs of Simulated DDR2 Controller

always @(posedge Clk)
begin
    if(Rst)
        RAM_RE <= #1 0;
    else
        RAM_RE <= #1 ((SM == pRd_RAM0) | (SM == pRd_RAM1));
end

always @(posedge Clk)
begin
    if(Rst)
        DDR2_DI_Valid <= #1 0;
    else
        DDR2_DI_Valid <= #1 RAM_RE;
end

always @(posedge Clk)
begin
    if(Rst)
        DDR2_DI <= #1 0;
    else if(RAM_RE)
        DDR2_DI <= #1 RAM_DO;
end
///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
//
//  DDR2 Controller Simulation State Machine
//

assign Rst_SM = Rst | ~DDR2_Phy_Init_Done;

always @(posedge Clk)
begin
    if(Rst_SM)
        SM <= #1 pIdle;
    else
        (* FULL_CASE, PARALLEL_CASE *)
        case(SM)
            pIdle : begin
                if(DDR2_CmdAF_EF)
                    SM <= #1 pIdle;
                else
                    SM <= #1 pLd_CmdDlyTmr;
            end

            pLd_CmdDlyTmr : begin
                SM <= #1 pWt_CmdDlyTmr;
            end

            pWt_CmdDlyTmr : begin
                if(~TC_CmdDlyTmr)
                    SM <= #1 pWt_CmdDlyTmr;
                else if(Read_Cmd)
                    SM <= #1 pRd_RAM0;
                else
                    SM <= #1 pWr_RAM0;
            end

            pRd_RAM0 : begin
                SM <= #1 pRd_RAM1;
            end

            pRd_RAM1 : begin
                if(DDR2_CmdAF_EF)
                    SM <= #1 pDelay;
                else if(Read_Cmd)
                    SM <= #1 pRd_RAM0;
                else
                    SM <= #1 pLd_CmdDlyTmr;
            end

            pWr_RAM0 : begin
                SM <= #1 pWr_RAM1;
            end

            pWr_RAM1 : begin
                if(DDR2_CmdAF_EF)
                    SM <= #1 pDelay;
                else if(Read_Cmd)
                    SM <= #1 pLd_CmdDlyTmr;
                else
                    SM <= #1 pWr_RAM0;
            end

            pDelay : begin
                SM <= #1 pIdle;
            end

            default : SM <= #1 pIdle;
        endcase
end

endmodule
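The post describes using a delay line with a separate read data valid signal per port when reads must overlap, but the in-lined modules do not include it. The following is only a minimal sketch of that idea, not the author's implementation. The module name, port names, and both parameters are hypothetical, and it assumes the read latency is a fixed number of clocks, which a real SDRAM controller does not guarantee once refresh and row changes are involved.

```verilog
// Hypothetical sketch: steer a read-data-valid strobe to the port that issued
// the read, assuming a fixed latency of LATENCY clocks (LATENCY must be >= 2).
module rd_valid_delay_line #(
    parameter PORTS   = 4,      // number of client ports (assumed)
    parameter LATENCY = 18      // clocks from read issue to data (assumed fixed)
)(
    input              clk,
    input              rst,
    input              rd_issue,    // one-clock strobe: a read command was queued
    input  [PORTS-1:0] grant,       // one-hot grant of the issuing port
    output [PORTS-1:0] port_valid   // per-port read data valid
);
    // LATENCY stages of PORTS bits each, packed into a single vector.
    reg [PORTS*LATENCY-1:0] line;

    always @(posedge clk) begin
        if (rst)
            line <= {(PORTS*LATENCY){1'b0}};
        else
            // Shift one stage toward the MSB end; the new entry is the grant
            // when a read is issued, or all zeros otherwise.
            line <= {line[PORTS*(LATENCY-1)-1:0], grant & {PORTS{rd_issue}}};
    end

    // The oldest stage tells which port, if any, owns the data arriving now.
    assign port_valid = line[PORTS*LATENCY-1 -: PORTS];
endmodule
```

Because the issuing port is remembered in the delay line, a port could release its request to the arbiter as soon as its read command is queued. In a real design the outputs would probably be qualified with the controller's own data valid strobe as well.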
_________________ Michael A.
|
Fri Jan 30, 2015 11:08 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2153 Location: Canada
|
I chose to cache read data in the controller. Reads from the caches can overlap between different devices. Writes go one at a time to a FIFO and may invalidate read caches. Here is what I've come up with so far (untested): http://github.com/robfinch/Cores/blob/master/memory/trunk/rtl/verilog/mpmc.v

A fixed-priority arbiter is used. Most of the devices are periodic, so it may be possible to work out a fixed access scheme. The DDR2 controller itself is taken from the demo project done by the board vendor. It's a core-generator component.
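As an illustration of a per-channel read cache that a write can invalidate, here is a minimal single-line sketch. It is not taken from mpmc.v. The module name, the signal names, the byte addressing, and the 28-bit address width are all assumptions made for the example.

```verilog
// Hypothetical single-line read cache for one channel (not from mpmc.v).
// Assumes byte addresses and 128-bit (16-byte) lines, so addr[AW-1:4]
// identifies the line.
module rd_line_cache #(
    parameter AW = 28               // byte-address width (assumed)
)(
    input            clk,
    input            rst,
    input            fill,          // a 128-bit line has returned from memory
    input  [AW-1:0]  fill_addr,
    input  [127:0]   fill_data,
    input            wr,            // some channel is writing to memory
    input  [AW-1:0]  wr_addr,
    input  [AW-1:0]  rd_addr,       // address this channel wants to read
    output           hit,
    output [127:0]   rd_data
);
    reg          valid;
    reg [AW-5:0] tag;
    reg [127:0]  line;

    wire [AW-5:0] fill_line = fill_addr[AW-1:4];
    wire [AW-5:0] wr_line   = wr_addr[AW-1:4];

    always @(posedge clk) begin
        if (rst)
            valid <= 1'b0;
        else if (fill) begin
            tag   <= fill_line;
            line  <= fill_data;
            // A write to the same line in the same clock makes the fill stale.
            valid <= ~(wr & (wr_line == fill_line));
        end else if (wr & (wr_line == tag))
            valid <= 1'b0;          // write hit: drop the cached copy
    end

    assign hit     = valid & (rd_addr[AW-1:4] == tag);
    assign rd_data = line;
endmodule
```

Invalidating instead of updating keeps the write path simple: the next read that misses just fetches the line again from memory.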
|
Wed Feb 04, 2015 6:28 am |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2153 Location: Canada
|
The mpmc has been tested with two devices. It seems that at least the cpu memory port is working. Memory can be filled and dumped back successfully. Also the cpu stack which is in memory works. The bitmapped controller displays bitmap data but only for about 55-60% of the screen width, after which the screen is blank. This is probably a problem with the bitmapped controller and not the multi-port memory.
I changed the channel setup somewhat and added a channel for an MMU. The CPU channel has been moved down to the lowest priority so that it doesn't hog 100% of the bus bandwidth. Sprite DMA had to be 32 bits to be consistent with the memory's accessibility in both slave and master modes.
ch#0: Bitmap controller: read-only, 128 bit units
ch#1: MMU: read-write, 32 bits
ch#2: Ethernet controller: read-write, 32 bits
ch#3: Graphics accelerator: read-write, 32 bits
ch#4: Audio DMA: read-write, 12 bit units
ch#5: Sprite DMA: read-only, 32 bit units
ch#6: CPU: read-write, 32 bits
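A fixed-priority arbiter for a channel list like this can be written so that the channel count is a parameter. The sketch below is an illustration only and is not the arbiter in mpmc.v. It assumes that the lowest channel number has the highest priority, which is consistent with the CPU sitting last at ch#6, and that a grant is held until the owner drops its request. The module and signal names are hypothetical.

```verilog
// Hypothetical parameterized fixed-priority arbiter with grant hold.
// Assumption: channel 0 has the highest priority, channel N-1 the lowest.
module fixed_prio_arb #(
    parameter N = 7                 // number of channels
)(
    input              clk,
    input              rst,
    input      [N-1:0] req,
    output reg [N-1:0] grant        // one-hot, or all zeros when idle
);
    integer     i;
    reg [N-1:0] pick;

    // Priority encoder: scan from the lowest priority down to channel 0 so
    // that the last assignment made is for the highest priority requester.
    always @* begin
        pick = {N{1'b0}};
        for (i = N - 1; i >= 0; i = i - 1)
            if (req[i]) begin
                pick    = {N{1'b0}};
                pick[i] = 1'b1;
            end
    end

    always @(posedge clk) begin
        if (rst)
            grant <= {N{1'b0}};
        else if (grant == {N{1'b0}})
            grant <= pick;              // idle: grant the best requester
        else if ((grant & req) == {N{1'b0}})
            grant <= {N{1'b0}};         // owner released: return to idle
    end
endmodule
```

With this ordering the CPU on channel 6 is granted only when no other channel is requesting at the moment the bus goes idle.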
|
Sun Feb 08, 2015 11:57 pm |
|
|
robfinch
Joined: Sat Feb 02, 2013 9:40 am Posts: 2153 Location: Canada
|
I found a way to improve the test system's memory controller (mpmc2) so that it can return data at a much higher rate. Basically, I make some use of pipelined memory access rather than using a strictly synchronous approach to bus transactions. It's not really fully overlapped read access except during a burst. Some burst accesses now sit in a loop issuing multiple read address requests before expecting anything back. Read data is cached in the controller for each channel. For channel #0, the bitmapped graphics channel, four burst read memory accesses are started, with data filling the cache as the read data comes back.
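The idea of issuing several read addresses before any data returns can be sketched as a small sequencer. This is not the mpmc2 code. The module name, the signal names, the address stride, and the assumption of one data valid strobe per issued read are all made up for the example.

```verilog
// Hypothetical burst read sequencer: queues BURST read addresses back to back
// and finishes once BURST data valid strobes have been counted.
module burst_rd_issue #(
    parameter AW     = 25,          // address width (assumed)
    parameter BURST  = 4,           // reads issued per burst
    parameter STRIDE = 8            // address step between reads (assumed)
)(
    input               clk,
    input               rst,
    input               start,          // begin a burst at base_addr
    input  [AW-1:0]     base_addr,
    input               af_full,        // address FIFO cannot accept more
    input               rd_data_valid,  // one strobe per returned read (assumed)
    output              af_we,          // write strobe to the address FIFO
    output reg [AW-1:0] af_addr,
    output reg          busy
);
    reg [7:0] issued;
    reg [7:0] returned;

    // Keep queuing addresses while some remain and the FIFO has room,
    // without waiting for earlier reads to come back.
    assign af_we = busy & (issued != BURST) & ~af_full;

    always @(posedge clk) begin
        if (rst) begin
            busy     <= 1'b0;
            issued   <= 8'd0;
            returned <= 8'd0;
            af_addr  <= {AW{1'b0}};
        end else if (!busy) begin
            if (start) begin
                busy     <= 1'b1;
                issued   <= 8'd0;
                returned <= 8'd0;
                af_addr  <= base_addr;
            end
        end else begin
            if (af_we) begin
                issued  <= issued + 8'd1;
                af_addr <= af_addr + STRIDE;
            end
            if (rd_data_valid)
                returned <= returned + 8'd1;
            if (rd_data_valid && (returned == BURST - 1))
                busy <= 1'b0;           // last read of the burst has arrived
        end
    end
endmodule
```

The gain comes from overlapping the controller's read latency across the whole burst instead of paying it once per read.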
To test out the memory controller I've been using a bitmapped graphics controller which does DMA access through the memory controller. With the old memory controller it couldn't keep up with hi-res bitmapped displays. The display would go blank halfway across the screen because of a lack of memory bandwidth. With the new controller, higher resolution displays are possible.
|
Thu Apr 28, 2016 4:22 am |
|