Configure the Core

The C-class core is highly parameterized and configurable. By changing a single configuration, the user can generate a core instance ranging in size from embedded micro-controllers to Linux-capable high-performance cores.

ISA Level Configurations

In RISC-V, both the Unprivileged and the Privileged specs offer a large number of choices for configuring an implementation. The Unprivileged spec offers various extensions and sub-extensions, such as Multiply-Divide, Floating Point, Atomic and Compressed, which a user can choose to implement or not.

The Privileged spec, on the other hand, provides a much larger space of configurability to the user. Apart from choosing which privilege modes to implement (Machine, Hypervisor, Supervisor or User), the spec also provides a huge number of Control and Status Registers (CSRs) which impact various aspects of the RISC-V system. For example, the misa CSR can be used to dynamically enable or disable execution of certain sub-extensions. Similarly, the valid and legal values of the satp.mode field indicate which paging schemes are supported by the underlying implementation.

To capture all such possible choices of the RISC-V ISA in a single standard format, InCore has proposed the RISCV-CONFIG YAML format, which has also been adopted by the riscv-community, primarily for the ISA compatibility framework. The core generator uses the same YAML inputs to control various ISA level features of the core.

Generating CSRs

For implementing the CSR module, C-Class uses the CSR-BOX utility to automatically create a bsv module which implements all the necessary CSRs as per the input YAML specification provided in riscv-config format. An example of the isa YAML is provided in the sample_config directory. CSR-BOX ensures that the WARL functions specified in the YAML are faithfully replicated in bsv. Along with the CSRs, CSR-BOX also provides methods and logic to handle traps and xRET instructions based on the privilege modes (U, S, H) defined in the ISA node of the input yaml.

Note that CSR-BOX allows one to split the CSRs in a daisy-chain fashion to reduce the impact on timing when instantiating a large number of CSRs. Thus, apart from the isa yaml, CSR-BOX also requires a grouping yaml file which indicates which daisy-chain unit should contain which set of CSRs.

CSR-BOX also takes in an optional debug spec yaml (as defined by riscv-config) to capture basic debug related information, like where the parking loop code of the debugger is placed in the memory map. Providing the debug spec also instructs CSR-BOX to implement the necessary logic for handling custom debug interrupts like halt, resume and step. The debug CSRs must be defined in the debug spec. TODO provide example LINK

CSR-BOX also allows the user to define custom CSRs that may be required by the implementation. C-Class uses a custom CSR to control the enabling/disabling of caches and branch predictors. The details of this CSR are provided here. An example YAML containing the definition of these CSRs, which can be fed into CSR-BOX, is available in the sample_config directory.

Other Derived Configuration Settings

Other than the CSRs, C-Class derives the following parameters from the input isa yaml:

  • The ISA string indicates which extensions should be enabled in hardware and its associated collaterals.
  • The max value in the supported_xlen node sets the xlen variable in C-Class. This is used to define the width of the integer register file, alu operations, bypass width, virtual address size, etc.
  • The flen variable in C-Class is set based on the presence of ‘F’ or ‘D’ characters in the ISA string.
  • If the ‘S’ extension is present in the ISA string, then C-Class detects the supervisor page translation mode to be implemented by detecting the max legal values of the satp.mode csr field present in the input yaml
  • The asid length to be used in the implementation is also derived by checking legal values of the satp.asid csr field.
  • The size of the physical address to be implemented is derived from the physical_addr_sz node of the isa yaml
  • The number of mhpmcounters (and therefore mhpmevents) and their behavior is also captured from the csrs defined in the input isa yaml
  • The number of pmp entries and their granularity are also captured from the input isa yaml.
  • Custom interrupts/exceptions and their cause values are also captured from the input isa yaml. The implementation creates an entry in the defines file for the name and cause value of each. The usage of these custom causes needs to be implemented separately in the bsv code.
  • The max size of the cause field in the mcause csr is also derived by checking for the max cause value being used after accounting for the custom interrupts and exceptions.
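
A few of the derivations above can be sketched in Python. This is an illustration only, not the generator's actual code: the function names, the way the ISA string is split, and the flen values (32 for 'F', 64 for 'D', 0 otherwise) are assumptions made for this example.

```python
import re

def single_letter_extensions(isa: str) -> str:
    """Return the single-letter extensions of an ISA string (assumed format)."""
    base = isa.upper().split("_")[0]      # drop '_'-separated multi-letter extensions
    base = re.sub(r"^RV\d+", "", base)    # drop the "RV32"/"RV64" prefix
    return base.split("Z")[0]             # drop a trailing Z* extension, if any

def derive_settings(isa: str, supported_xlen: list) -> dict:
    """Hypothetical helper mirroring the xlen/flen/supervisor derivations."""
    exts = single_letter_extensions(isa)
    if "D" in exts:
        flen = 64   # assumption: 'D' implies a 64-bit floating point width
    elif "F" in exts:
        flen = 32   # assumption: 'F' alone implies a 32-bit width
    else:
        flen = 0    # assumption: 0 marks "no floating point support"
    return {
        "xlen": max(supported_xlen),   # max value of the supported_xlen node
        "flen": flen,
        "supervisor": "S" in exts,     # gates the satp-based derivations
    }

print(derive_settings("RV64IMAFDCSUZicsr_Zifencei", [64]))
```

The remaining items in the list (satp.mode, satp.asid, pmp and mhpmcounter settings) are read from the corresponding CSR nodes of the isa yaml and are not modelled here.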

Micro-Architectural Configuration hooks

The C-Class core has also defined a custom schema to control various micro-architectural features of the core. A sample configuration file is available in the sample_config directory.

The following provides a list and description of the configuration hooks available at the micro-architectural level. Note that this configuration also contains hooks which control the bluespec compilation commands and the verilator commands.

num_harts

Description: Total number of harts to be instantiated in the dummy test-soc. Note that these will be non-coherent cores simply acting as masters on the fast-bus.

Examples:

num_harts: 2

isb_sizes

Description: A dictionary controlling the size of the inter-stage buffers of the pipeline. The variable isb_s0s1 controls the size of the isb between stage0 and stage1. Similarly isb_s1s2 dictates the size of the isb between stage1 and stage2 and so on. By increasing isb_s0s1 and isb_s1s2 one can shadow the stalls or latencies in the backend stages of the pipeline by fetching more instructions into the front-end stages of the pipeline.

There is a restriction, however, that isb_s2s3 should always be 1. This is because the outputs of the register file accessed in stage2 are not buffered, and neither is the bypass scheme implemented to handle this scenario.

One can however increase the number of in-flight instructions by increasing the sizes of isb_s3s4 and isb_s4s5 (increasing isb_s3s4 has a larger impact).

Also note that if write-after-write stalls are disabled, the size of the wawid is defined by the sum of isb_s3s4 and isb_s4s5. Therefore, increasing the number of in-flight instructions causes a logarithmic increase in the size of the wawid used for maintaining bypass of operands.

Examples:

isb_sizes :
  isb_s0s1: 2
  isb_s1s2: 2
  isb_s2s3: 1
  isb_s3s4: 2
  isb_s4s5: 2
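
The constraints described above can be expressed as a small Python check. This is a sketch under assumptions: the function name is hypothetical, and the wawid width is taken to be the ceiling of log2 of the in-flight instruction count (isb_s3s4 + isb_s4s5), which is one reading of the "logarithmic increase" mentioned above rather than a confirmed formula.

```python
def wawid_bits(isb_sizes: dict, waw_stalls: bool) -> int:
    """Validate isb_sizes and estimate the wawid width in bits (assumed formula)."""
    if isb_sizes["isb_s2s3"] != 1:
        raise ValueError("isb_s2s3 must always be 1")
    if waw_stalls:
        return 0  # assumption: no wawid is needed when WAW hazards stall
    in_flight = isb_sizes["isb_s3s4"] + isb_sizes["isb_s4s5"]
    # ceil(log2(in_flight)) computed with integers, with a floor of 1 bit
    return max(1, (in_flight - 1).bit_length())

example = {"isb_s0s1": 2, "isb_s1s2": 2, "isb_s2s3": 1, "isb_s3s4": 2, "isb_s4s5": 2}
print(wawid_bits(example, waw_stalls=False))  # 2
```

Under this assumption, doubling both isb_s3s4 and isb_s4s5 to 4 adds only a single bit to the id.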

merged_rf

Description: Boolean field to indicate if the architectural registerfiles for floating point and integer should be implemented as a single extended regfile in hardware or as separate regfiles. This field only makes sense when ‘F’ support is enabled in the ISA string of the input isa yaml. On certain targets like FPGAs, or in certain technologies, maintaining a single registerfile might lead to better area and timing.

Examples:

merged_rf: True

total_events

Description: This field indicates the total number of events that can be used to program the mhpm counters. This field is used to capture the size of the event signals that drive the counters.

Examples:

total_events: 28

waw_stalls

Description: Indicates if stalls must occur on a WAW hazard. If you are looking for higher performance set this to False. Setting this to true would lead to instructions stalling in stage3 due to a WAW hazard.

Setting this to False also means the scoreboard will now allocate a unique id to the destination register of every instruction that is offloaded for execution. The size of this id depends on the number of in-flight instructions after the execution stage, which in turn depends on the sizes of isb_s3s4 and isb_s4s5 as defined above.

Examples:

waw_stalls: False

iepoch_size

Description: integer value indicating the size of the epochs for the instruction memory subsystem. Allowed value is 2 only

Examples:

iepoch_size: 2

depoch_size

Description: integer value indicating the size of the epochs for the data memory subsystem. Allowed value is 1 only

Examples:

depoch_size: 1

s_extension

Description: Describes various supervisor and MMU related parameters. These parameters only take effect when “S” is present in the ISA field.

  • itlb_size: integer indicating the number of entries in the fully-associative Instruction TLB
  • dtlb_size: integer indicating the number of entries in the fully-associative Data TLB

Examples:

s_extension:
  itlb_size: 4
  dtlb_size: 4

a_extension

Description: Describes various A-extension related parameters. These params take effect only when the “A” extension is enabled in the riscv_config ISA

  • reservation_size: integer indicating the size of the reservation in bytes. The value must be a power of 2 and at least 4. For RV64 systems the minimum should be 8 bytes.

Examples:

a_extension:
  reservation_size: 8
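
The rules for reservation_size can be checked with a few lines of Python. This is a minimal sketch; the function name and the use of ValueError are assumptions for illustration and do not reflect how the core generator reports schema errors.

```python
def check_reservation_size(reservation_size: int, xlen: int) -> None:
    """Raise ValueError if reservation_size breaks the rules stated above."""
    minimum = 8 if xlen == 64 else 4  # RV64 needs at least 8 bytes
    is_power_of_two = reservation_size > 0 and (reservation_size & (reservation_size - 1)) == 0
    if not is_power_of_two:
        raise ValueError("reservation_size must be a power of 2")
    if reservation_size < minimum:
        raise ValueError(f"reservation_size must be at least {minimum} bytes for xlen={xlen}")

check_reservation_size(8, xlen=64)   # accepted
check_reservation_size(4, xlen=32)   # accepted
```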

m_extension

Description: Describes various M-extension related parameters. These parameters take effect only if “M” is present in the ISA field. The multiplier used in the core is a retimed one. The parameters below indicate the number of input and output registers around the combinational block to enable retiming.

  • mul_stages_out: Number of stages to be inserted after the multiplier combinational block. Minimum value is 1.
  • mul_stages_in: Number of stages to be inserted before the multiplier combinational block. Minimum value is 0
  • div_stages: an integer indicating the number of cycles for a single division operation. Max value is limited to the XLEN defined in the ISA.

Examples:

m_extension:
  mul_stages_in  : 2
  mul_stages_out : 2
  div_stages: 32

branch_predictor

Description: Describes various branch predictor related parameters.

  • instantiate: boolean value indicating if the predictor needs to be instantiated or not.
  • predictor: string indicating the type of predictor to be implemented. Valid values are: [‘gshare’]
  • btb_depth: integer indicating the size of the branch target buffer
  • bht_depth: integer indicating the size of the branch history buffer
  • history_len: integer indicating the size of the global history register
  • history_bits: integer indicating the number of bits used for indexing bht/btb.
  • ras_depth: integer indicating the size of the return address stack.

Examples:

branch_predictor:
  instantiate: True
  predictor: gshare
  btb_depth: 32
  bht_depth: 512
  history_len: 8
  history_bits: 5
  ras_depth: 8

icache_configuration

Description: Describes the various instruction cache related features.

  • instantiate: boolean value indicating if the instruction cache needs to be instantiated or not.
  • sets: integer indicating the number of sets in the cache
  • word_size: integer indicating the number of bytes in a word. Fixed to 4.
  • block_size: integer indicating the number of words in a cache-block.
  • ways: integer indicating the number of the ways in the cache
  • fb_size: integer indicating the number of fill-buffer entries in the cache
  • replacement: strings indicating the replacement policy. Valid values are: [“PLRU”, “RR”, “Random”]
  • ecc_enable: boolean field indicating if ECC should be enabled on the cache.
  • one_hot_select: boolean value indicating if the bsv one-hot selection function should be used instead of conventional for-loops to choose amongst lines/fb-lines. This choice has no effect on the functionality

If supervisor is enabled then the max size of a single way should not exceed 4 kilobytes

Examples:

icache_configuration:
  instantiate: True
  sets: 4
  word_size: 4
  block_size: 16
  ways: 4
  fb_size: 4
  replacement: "PLRU"
  ecc_enable: false
  one_hot_select: false
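
The way-size restriction can be verified from the cache parameters alone. The sketch below assumes that the size of one way is sets × block_size × word_size bytes, which follows from the field descriptions above; the function names and the dictionary layout are hypothetical and chosen only for this example.

```python
MAX_WAY_BYTES = 4096  # the 4 kilobyte limit that applies when supervisor is enabled

def way_size_bytes(cache: dict) -> int:
    """Bytes in one way: sets x words-per-block x bytes-per-word."""
    return cache["sets"] * cache["block_size"] * cache["word_size"]

def total_cache_bytes(cache: dict, supervisor: bool) -> int:
    """Check the restrictions stated above and return the total cache size."""
    if cache["word_size"] != 4:
        raise ValueError("word_size is fixed to 4")
    way = way_size_bytes(cache)
    if supervisor and way > MAX_WAY_BYTES:
        raise ValueError(f"way size {way} exceeds {MAX_WAY_BYTES} bytes")
    return way * cache["ways"]

icache = {"sets": 4, "word_size": 4, "block_size": 16, "ways": 4}
print(way_size_bytes(icache))                      # 256
print(total_cache_bytes(icache, supervisor=True))  # 1024
```

With word_size at 4 and block_size at 16, a single way reaches the 4 kilobyte limit at 64 sets, so larger caches under supervisor mode have to grow by adding ways.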

dcache_configuration

Description: Describes the various data cache related features.

  • instantiate: boolean value indicating if the data cache needs to be instantiated or not.
  • sets: integer indicating the number of sets in the cache
  • word_size: integer indicating the number of bytes in a word. Fixed to 4.
  • block_size: integer indicating the number of words in a cache-block.
  • ways: integer indicating the number of the ways in the cache
  • fb_size: integer indicating the number of fill-buffer entries in the cache
  • sb_size: integer indicating the number of store-buffer entries in the cache. Fixed to 2
  • lb_size: integer indicating the number of lines to be stored in the store buffer. Applicable only when rwports == 1r1w
  • ib_size: integer indicating the number of io-buffer entries in the cache. Defaults to 2
  • replacement: strings indicating the replacement policy. Valid values are: [“PLRU”, “RR”, “Random”]
  • ecc_enable: boolean field indicating if ECC should be enabled on the cache.
  • one_hot_select: boolean value indicating if the bsv one-hot selection function should be used instead of conventional for-loops to choose amongst lines/fb-lines. This choice has no effect on the functionality
  • rwports: number of read-write ports available on the brams. Allowed values are 1rw, 1r1w and 2rw

If supervisor is enabled then the max size of a single way should not exceed 4 kilobytes

Examples:

dcache_configuration:
  instantiate: True
  sets: 4
  word_size: 4
  block_size: 16
  ways: 4
  fb_size: 4
  sb_size: 2
  lb_size: 2
  ib_size: 2
  replacement: "PLRU"
  ecc_enable: false
  one_hot_select: false
  rwports: 1r1w

reset_pc

Description: Integer value indicating the reset value of program counter

Example:

bus_protocol

Description: bus protocol for the master interfaces of the core. Fixed to “AXI4”

Examples:

bus_protocol: AXI4

fpu_trap

Description: Boolean value indicating if the core should trap on floating point exception and integer divide-by-zero conditions.

Examples:

fpu_trap: False

verilator_configuration

Description: describes the various configurations for verilator compilation.

  • coverage: indicates the type of coverage that the user would like to track. Valid values are: [“none”, “line”, “toggle”, “all”]
  • trace: boolean value indicating if vcd dumping should be enabled.
  • threads: an integer field indicating the number of threads to be used during simulation
  • verbosity: a boolean field indicating if the verbose/display statements in the generated verilog should be compiled or not.
  • out_dir: name of the directory where the final executable will be dumped.
  • sim_speed: indicates if the user would prefer a fast simulation or slow simulation. Valid values are: [“fast”, “slow”]. Note that selecting “fast” will speed up simulation but slow down compilation, while selecting “slow” does the opposite.

Examples:

verilator_configuration:
  coverage: "none"
  trace: False
  threads: 1
  verbosity: True
  open_ocd: False
  sim_speed: fast

bsc_compile_options

Description: Describes the various bluespec compile options

  • test_memory_size: size of the BRAM memory in the test-SoC in bytes. Default is 32MB
  • assertions: boolean value indicating if assertions used in the design should be compiled or not
  • trace_dump: boolean value indicating if the logic to generate a simple trace should be implemented or not. Note this is only for simulation and not a real trace
  • compile_target: a string indicating if the bsv files are being compiled for simulation or for asic/fpga synthesis. The valid values are: [ ‘sim’, ‘asic’, ‘fpga’ ]
  • suppress_warnings: List of warnings which can be suppressed during bluespec compilation. Valid values are: [“none”, “all”, “G0010”, “T0054”, “G0020”, “G0024”, “G0023”, “G0096”, “G0036”, “G0117”, “G0015”]
  • ovl_assertions: boolean value indicating if OVL based assertions must be turned on/off
  • ovl_path: string indicating the path where the OVL library is installed.
  • sva_assertions: boolean value indicating if SVA based assertions must be turned on/off
  • verilog_dir: the directory name of where the generated verilog will be dumped
  • open_ocd: a boolean field indicating if the test-bench should have an open-ocd vpi enabled.
  • build_dir: the directory name where the bsv build files will be dumped
  • top_module: name of the top-level bluespec module to be compiled.
  • top_file: file containing the top-level module.
  • top_dir: directory containing the top_file.
  • cocotb_sim: boolean variable. When set, the terminating conditions in the test-bench environment are disabled, as the cocotb environment is meant to handle them. When set to False, the bluespec test-bench holds the terminating conditions.

Examples:

bsc_compile_options:
  assertions: True
  trace_dump: True
  suppress_warnings: "none"
  top_module: mkTbSoc
  top_file: TbSoc
  top_dir: base_sim
  out_dir: bin

noinline_modules

Description: This node contains multiple module names, each of which takes a boolean value. Setting a module to True generates a separate verilog file for that module during bluespec compilation. If set to False, that module is inlined into the module above it in the hierarchy in the generated verilog.

Examples:

noinline_modules:
  stage0: False
  stage1: True
  stage2: False
  stage3: False