Mapping SRLs to registers, MLABs, or Block RAMs


This wiki page is for users who want more flexibility in mapping SRL-like logic elements to registers, MLABs, or block RAM. The example attached to this page targets a Stratix V device. However, the concepts apply to any device that provides registers, distributed memory, and block RAM.

The following users should be interested in this page:

  • Users converting Xilinx designs to Altera
  • Users who see SRLs being mapped to block-RAM-based altshift_taps and want more choice in mapping to registers or distributed MLAB memory
  • Users wanting more design density
  • Users wanting better timing performance
  • Users not able to write inferred SRL RTL that maps to registers or distributed memory

The author of this page has successfully converted multiple Xilinx-based designs that relied heavily on SRL16 and SRL32 RAM-based shift registers.

The example design in this article shows how to map SRL elements using VHDL. The same concept applies when mapping SRLs to registers, MLABs, or block RAMs in Verilog.

The following top-level code is an example of how VHDL generics can pass the SRL size and width into a lower-level VHDL block that maps an SRL to registers, an MLAB, or block memory based on those generics. Users can adjust the thresholds to suit their needs for mapping into registers, MLABs, or block RAMs.

The top-level file for the design is shown below. The design has six instances of SRL_Test, each of which wraps srl_array3.vhd. Each instance passes different SRL size and width parameters down to srl_array3.vhd, so a user can see how Quartus maps the SRL based on the thresholds set in srl_array3.vhd. (Note: as of 10/8/14 using Quartus 13.1, instances 5 and 6 are packed into the same MLAB.)

Update in Quartus Synthesis capabilities: The design below infers a RAM and uses the ramstyle attribute to determine which block it goes into. Then the pointers are added around it in HDL to create a shift-register, so it is manually creating an altshift_taps instead of inferring one. Quartus 13.0 added the capability for ramstyle to be used directly in an inferred shift-register. Here is an example in SystemVerilog of a large SR that would normally go into M20Ks but I am forcing into MLABs:

module test (
  input clk,
  input [4:0] din,
  output reg [4:0] dout);

  (* ramstyle = "MLAB" *) reg [4:0] sr [1023:0]; // "MLAB", "M20K", "logic", etc.

  always @ (posedge clk) begin
    sr[0] <= din;
    sr[1023:1] <= sr[1022:0];
    dout <= sr[1023];
  end
endmodule

So the new support for the ramstyle attribute on the shift-register is probably an easier flow, but the following VHDL is excellent for anything pre-Q13.0, or if you don't want to infer altshift_taps:

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY Stratix5_Top IS
PORT
(
inclk : IN std_logic;
reset_n : IN std_logic;
enable : IN std_logic;
SRLin : IN std_logic_vector(9 downto 0);
SRLsel : IN std_logic_vector(1 downto 0);
SRLout : OUT std_logic_vector(9 downto 0)
);

END Stratix5_Top;

ARCHITECTURE struct OF Stratix5_Top IS

constant WIDTH : integer := 10;

-- Internal signal declarations
signal SRLout1 : std_logic_vector(WIDTH-1 downto 0);
signal SRLout2 : std_logic_vector(WIDTH-1 downto 0);
signal SRLout3 : std_logic_vector(WIDTH-1 downto 0);
signal SRLout4 : std_logic_vector(0 downto 0);
signal SRLout5 : std_logic_vector(0 downto 0);

--attribute keep : boolean;
--attribute keep of logic1a : signal is true;
--attribute keep of logic1b : signal is true;

COMPONENT SRL_Test
GENERIC (
WIDTH : natural := 4;
SIZE : natural := 4
);
PORT (
clk : IN std_logic;
reset_n : IN std_logic;
enable : IN std_logic;
indata : IN std_logic_vector(WIDTH-1 downto 0);
outdata : OUT std_logic_vector(WIDTH-1 downto 0)
);
END COMPONENT;

BEGIN
-- Chain six SRL_Test instances, each with different WIDTH and SIZE generics
SRL_Test_inst1 : SRL_Test
GENERIC MAP (
WIDTH => 10,
SIZE => 1024
)
PORT MAP (
clk => inclk,
reset_n => reset_n,
enable => enable,
indata => SRLin,
outdata => SRLout1
);

SRL_Test_inst2 : SRL_Test
GENERIC MAP (
WIDTH => 10,
SIZE => 640
)
PORT MAP (
clk => inclk,
reset_n => reset_n,
enable => enable,
indata => SRLout1,
outdata => SRLout2
);

SRL_Test_inst3 : SRL_Test
GENERIC MAP (
WIDTH => 10,
SIZE => 2
)
PORT MAP (
clk => inclk,
reset_n => reset_n,
enable => enable,
indata => SRLout2,
outdata => SRLout3
);

SRL_Test_inst4 : SRL_Test
GENERIC MAP (
WIDTH => 1,
SIZE => 40
)
PORT MAP (
clk => inclk,
reset_n => reset_n,
enable => enable,
indata => SRLout3(0 downto 0),
outdata => SRLout4
);

SRL_Test_inst5 : SRL_Test
GENERIC MAP (
WIDTH => 1,
SIZE => 65
)
PORT MAP (
clk => inclk,
reset_n => reset_n,
enable => enable,
indata => SRLout4,
outdata => SRLout5
);

SRL_Test_inst6 : SRL_Test
GENERIC MAP (
WIDTH => 1,
SIZE => 65
)
PORT MAP (
clk => inclk,
reset_n => reset_n,
enable => enable,
indata => SRLout5,
outdata => SRLout(0 downto 0)
);

SRLout(WIDTH-1 downto 1) <= SRLout3(WIDTH-1 downto 1);

END struct;
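
As a quick cross-check, the branch that each of the six instances above should select can be tabulated from the WIDTH and SIZE generics and the two thresholds used in srl_array3.vhd (listed later on this page). The following Python sketch is only a model of that comparison; the function and variable names are made up for illustration, and the resource the Fitter finally uses may differ from what the RTL requests.

```python
def srl_target(width: int, size: int) -> str:
    """Return the branch srl_array3.vhd takes for a WIDTH x SIZE shift register."""
    bits = size * width
    if bits >= 1024:   # mirrors `SIZE * WIDTH >= 1024` in srl_array3.vhd
        return "M20K"
    if bits >= 41:     # mirrors `SIZE * WIDTH >= 41`
        return "MLAB"
    return "registers"


# (WIDTH, SIZE) generics of SRL_Test_inst1 .. SRL_Test_inst6 in Stratix5_Top
INSTANCES = [(10, 1024), (10, 640), (10, 2), (1, 40), (1, 65), (1, 65)]


def mapping_table():
    """List (instance number, WIDTH, SIZE, total bits, selected branch)."""
    return [
        (i + 1, w, s, w * s, srl_target(w, s))
        for i, (w, s) in enumerate(INSTANCES)
    ]


if __name__ == "__main__":
    for inst, w, s, bits, target in mapping_table():
        print(f"inst{inst}: WIDTH={w:2d} SIZE={s:4d} bits={bits:5d} -> {target}")
```

Under these thresholds, instances 1 and 2 select the M20K branch, instances 3 and 4 the register branch, and instances 5 and 6 the MLAB branch, which is consistent with the notes on this page about instances 4, 5, and 6.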

The VHDL component SRL_Test is shown next. This block was created to experiment with different versions of the srl_array*.vhd blocks. The final example in this article uses srl_array3.vhd since it best describes the implementation for creating register, MLAB, or block RAM based SRLs.

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY SRL_Test IS
generic
(
WIDTH : natural := 4;
SIZE : natural := 4
);
PORT
(
clk : IN std_logic;
reset_n : IN std_logic;
enable : IN std_logic;
indata : IN std_logic_vector(WIDTH-1 downto 0);
outdata : OUT std_logic_vector(WIDTH-1 downto 0)
);
END SRL_Test;

ARCHITECTURE struct OF SRL_Test IS
-- Internal signal declarations
COMPONENT srl_array3
GENERIC (
WIDTH : natural := 4;
SIZE : natural := 4
);
PORT (
clk : IN std_logic;
reset_n : IN std_logic;
enable : IN std_logic;
srl_i : IN std_logic_vector(WIDTH-1 downto 0);
srl_o : OUT std_logic_vector(WIDTH-1 downto 0)
);
END COMPONENT;

BEGIN

srl_array_inst : srl_array3
GENERIC MAP (
WIDTH => WIDTH,
SIZE => SIZE
)
PORT MAP (
clk => clk,
reset_n => reset_n,
enable => enable,
srl_i => indata,
srl_o => outdata
);

END struct;

And finally, the code for srl_array3.vhd.

library ieee;
use ieee.std_logic_1164.all;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;
use ieee.numeric_std.all;

entity srl_array3 is
generic
(
WIDTH : natural := 1;
SIZE : natural := 16
);
port
(
clk : in std_logic;
reset_n : in std_logic;
enable : in std_logic;
srl_i : in std_logic_vector(WIDTH-1 downto 0);
srl_o : out std_logic_vector(WIDTH-1 downto 0)
);
end entity;

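architecture rtl of srl_array3 is

  -- One storage array per target resource; the ramstyle attribute steers
  -- each array to registers ("logic"), MLABs, or M20K block RAM.
  attribute ramstyle : string;
  type SRLARR is array (SIZE-2 downto 0) of std_logic_vector(WIDTH-1 downto 0);
  signal sreg_reg : SRLARR;
  attribute ramstyle of sreg_reg : signal is "logic";
  signal sreg_mlab : SRLARR;
  attribute ramstyle of sreg_mlab : signal is "MLAB,no_rw_check";
  signal sreg_bram : SRLARR;
  attribute ramstyle of sreg_bram : signal is "M20K";

  -- Write pointer and derived read pointer for the RAM-based variants
  signal waddr : integer range SIZE-2 downto 0;
  signal waddr_m_addr : integer range SIZE-2 downto 0;

begin

  -- Read pointer derived from the write pointer.
  -- Note: waddr-SIZE is always negative, so an RTL simulator will report a
  -- range violation here; the expression appears to rely on address
  -- wrap-around in the synthesized hardware.
  waddr_m_addr <= waddr-SIZE;

  process (clk) begin
    -- RAM write port and write-pointer increment
    -- (reset_n is not used in this architecture)
    if (rising_edge(clk)) then
      if (enable = '1') then
        sreg_bram(waddr) <= srl_i;
        sreg_mlab(waddr) <= srl_i;
        waddr <= waddr + 1;
      end if;
    end if;

    -- SIZE and WIDTH are generics, so the comparisons below are constant
    -- and each instance uses exactly one of the three branches.
    if (rising_edge(clk)) then
      if (enable = '1') then
        if (SIZE * WIDTH >= 1024) then    -- 1024 bits or more: M20K
          srl_o <= sreg_bram(waddr_m_addr);
        elsif (SIZE * WIDTH >= 41) then   -- 41 to 1023 bits: MLAB
          srl_o <= sreg_mlab(waddr_m_addr);
        else                              -- 40 bits or fewer: registers
          sreg_reg(0) <= srl_i;
          sreg_reg(SIZE-2 downto 1) <= sreg_reg(SIZE-3 downto 0);
          srl_o <= sreg_reg(SIZE-2);
        end if;
      end if;
    end if;

  end process;

end rtl;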
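
For intuition about why a RAM plus an address pointer can stand in for a chain of registers, here is a small behavioral model in Python. It is an idealized sketch, not a cycle-exact transcription of srl_array3.vhd: the class names are hypothetical, the pointer wraps with an explicit modulo, the read returns the old contents of the location being overwritten, and the output register and the read-during-write constraints of real MLAB and M20K blocks are ignored.

```python
class RegisterShift:
    """Register chain: every stored sample moves one position per clock."""

    def __init__(self, depth, fill=0):
        self.regs = [fill] * depth

    def clock(self, din):
        dout = self.regs[-1]                 # oldest sample leaves the chain
        self.regs = [din] + self.regs[:-1]   # everything else shifts by one
        return dout


class PointerShift:
    """RAM plus one wrapping pointer: the data stays put, the pointer moves."""

    def __init__(self, depth, fill=0):
        self.mem = [fill] * depth
        self.addr = 0

    def clock(self, din):
        dout = self.mem[self.addr]           # read the oldest entry first
        self.mem[self.addr] = din            # then overwrite it with the new one
        self.addr = (self.addr + 1) % len(self.mem)
        return dout


def same_delay(depth, samples):
    """Return True if both models produce identical output streams."""
    reg, ram = RegisterShift(depth), PointerShift(depth)
    return all(reg.clock(x) == ram.clock(x) for x in samples)


if __name__ == "__main__":
    print(same_delay(65, range(1, 500)))
```

Both models delay the input by `depth` clocks. The register version moves every bit on every clock, while the pointer version performs one write and one read per clock, which is what makes a memory-based implementation attractive for long shift registers.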

Here is the sdc file used for the design.

# Constrain clock port inclk with a 20-ns period (50 MHz)
create_clock -period 20 [get_ports inclk]

# Automatically apply a generated clock on the output of phase-locked loops (PLLs)
# This command can be safely left in the SDC even if no PLLs exist in the design
derive_pll_clocks

The following is the qsf file used for the design to map SRLs into registers, MLABs, or block RAM. Notice that a LogicLock experiment was attempted for instance 4 of SRL_Test: the goal was to place all 40 registers of that instance within a single LAB. The experiment shows that a user can pack all 40 SRL registers into the same LAB.

set_global_assignment -name FAMILY "Stratix V"
set_global_assignment -name DEVICE 5SGXMA5K2F40C3
set_global_assignment -name TOP_LEVEL_ENTITY Stratix5_Top
set_global_assignment -name ORIGINAL_QUARTUS_VERSION "11.1 SP2.DP8"
set_global_assignment -name PROJECT_CREATION_TIME_DATE "10:59:54 APRIL 13, 2012"
set_global_assignment -name LAST_QUARTUS_VERSION "12.1 SP0.DP4"
set_global_assignment -name MIN_CORE_JUNCTION_TEMP 0
set_global_assignment -name MAX_CORE_JUNCTION_TEMP 85
set_global_assignment -name ERROR_CHECK_FREQUENCY_DIVISOR 256
set_global_assignment -name PARTITION_NETLIST_TYPE SOURCE -section_id Top
set_global_assignment -name PARTITION_FITTER_PRESERVATION_LEVEL PLACEMENT_AND_ROUTING -section_id Top
set_global_assignment -name PARTITION_COLOR 16764057 -section_id Top
set_global_assignment -name STRATIX_DEVICE_IO_STANDARD "2.5 V"
set_global_assignment -name NUM_PARALLEL_PROCESSORS 1
set_global_assignment -name POWER_PRESET_COOLING_SOLUTION "23 MM HEAT SINK WITH 200 LFPM AIRFLOW"
set_global_assignment -name POWER_BOARD_THERMAL_MODEL "NONE (CONSERVATIVE)"
set_global_assignment -name EDA_SIMULATION_TOOL "ModelSim-Altera (VHDL)"
set_global_assignment -name EDA_OUTPUT_DATA_FORMAT VHDL -section_id eda_simulation
set_global_assignment -name LL_ENABLED ON -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_AUTO_SIZE OFF -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_STATE LOCKED -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_RESERVED OFF -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_SECURITY_ROUTING_INTERFACE OFF -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_IGNORE_IO_BANK_SECURITY_CONSTRAINT OFF -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_PR_REGION OFF -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_WIDTH 1 -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_HEIGHT 1 -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name LL_ORIGIN X167_Y34 -section_id "SRL_Test:SRL_Test_inst4"
set_instance_assignment -name LL_MEMBER_OF "SRL_Test:SRL_Test_inst4" -to "SRL_Test:SRL_Test_inst4" -section_id "SRL_Test:SRL_Test_inst4"
set_global_assignment -name VHDL_FILE srl_array3.vhd
set_global_assignment -name VHDL_FILE srl_array2.vhd
set_global_assignment -name VHDL_FILE Stratix5_Top.vhd
set_global_assignment -name VHDL_FILE SRL_Test.vhd
set_global_assignment -name VHDL_FILE srl_array.vhd
set_global_assignment -name QIP_FILE SR_FLEXMEM.qip
set_instance_assignment -name PARTITION_HIERARCHY root_partition -to | -section_id Top

Here is the ChipPlanner picture of SRL_Test instance 4.

[Image: mik_Intel_0-1594331807394.png]

Last update: 07-09-2020 02:57 PM