Configure the Core¶
The C-class core is highly parameterized and configurable. By changing a single configuration, the user can generate a core instance ranging in size from embedded micro-controllers to Linux-capable high-performance cores.
ISA Level Configurations¶
In RISC-V, both the Unprivileged and the Privileged specs offer a large number of choices for configuring an implementation. The Unprivileged spec offers various extensions and sub-extensions such as Multiply-Divide, Floating Point, Atomic, Compressed, etc., which a user can choose to implement or not.
The Privileged spec, on the other hand, provides a much larger space of configurability to the user. Apart from choosing which privilege modes to implement (Machine, Hypervisor, Supervisor or User), the spec also provides a large number of Control and Status Registers (CSRs) which impact various aspects of the RISC-V system. For example, the misa CSR can be used to dynamically enable or disable execution of certain sub-extensions. Similarly, the legal values of the satp.mode field indicate which paging schemes are supported by the underlying implementation.
To capture all such possible choices of the RISC-V ISA in a single standard format, InCore has proposed the RISCV-CONFIG YAML format, which has also been adopted by the riscv-community, primarily for the ISA compatibility framework. The core generator uses the same YAML inputs to control various ISA level features of the core.
Generating CSRs¶
To implement the CSR module, C-Class uses the CSR-BOX utility to automatically create a bsv module which implements all the necessary CSRs as per the input YAML specification provided in riscv-config format. An example of the isa YAML is provided in the sample_config directory. CSR-BOX ensures that the WARL functions specified in the YAML are faithfully replicated in bsv. Along with the CSRs, CSR-BOX also provides methods and logic to handle traps and xRET instructions based on the privilege modes (U, S, H) defined in the ISA node of the input yaml.
Note that CSR-BOX allows one to split the CSRs across a daisy-chain of units to reduce the impact on timing when instantiating a large number of CSRs. Thus, apart from the isa yaml, CSR-BOX also requires a grouping yaml file which indicates which daisy-chain unit should contain which set of CSRs.
CSR-BOX also takes in an optional debug spec yaml (as defined by riscv-config) to capture basic debug related information, such as where the parking loop code of the debug module is placed in the memory map. Providing the debug spec also instructs CSR-BOX to implement the necessary logic for handling custom debug interrupts like halt, resume and step. The debug CSRs must be defined in the debug spec. TODO provide example LINK
CSR-BOX also allows the user to define custom CSRs that may be required by the implementation. C-Class uses a custom CSR to control the enabling/disabling of caches and branch predictors. The details of this CSR are provided here. An example YAML containing the definition of these CSRs which can be fed into CSR-BOX is available in the sample_config directory.
Other Derived Configuration Settings¶
Other than the CSRs, C-Class derives the following parameters from the input isa yaml:
- The ISA string indicates which extensions are enabled in hardware and in its associated collaterals.
- The max value in the supported_xlen node sets the xlen variable in C-Class. This is used to define the width of the integer register file, alu operations, bypass width, virtual address size, etc.
- The flen variable in C-Class is set based on the presence of the ‘F’ or ‘D’ characters in the ISA string.
- If the ‘S’ extension is present in the ISA string, C-Class determines the supervisor page translation mode to be implemented from the max legal value of the satp.mode csr field in the input yaml.
- The asid length to be used in the implementation is also derived by checking the legal values of the satp.asid csr field.
- The size of the physical address to be implemented is derived from the physical_addr_sz node of the isa yaml.
- The number of mhpmcounters (and therefore mhpmevents) and their behavior is also captured from the csrs defined in the input isa yaml.
- The number of pmp entries and their granularity are also captured from the input isa yaml.
- Custom interrupts/exceptions and their cause values are also captured from the input isa yaml. The implementation creates an entry in the defines file for each name and cause value. The usage of these custom causes needs to be implemented separately in the bsv code.
- The max size of the cause field in the mcause csr is also derived by checking for the max cause value being used after accounting for the custom interrupts and exceptions.
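As a rough illustration of the derivations listed above, the following Python sketch computes a few of these values from plain inputs. The function names and the way the inputs are passed are hypothetical and are not taken from the C-Class generator; only the FLEN rule (32 for ‘F’, 64 for ‘D’) and the satp.mode encodings come from the RISC-V specifications. The cause-field width is assumed here to be the number of bits needed to hold the largest cause value.

```python
# Hypothetical helpers illustrating the derivations; not the C-Class generator code.

# satp.mode encodings from the RISC-V privileged specification.
SATP_MODES_RV32 = {0: "bare", 1: "sv32"}
SATP_MODES_RV64 = {0: "bare", 8: "sv39", 9: "sv48", 10: "sv57"}


def derive_xlen(supported_xlen):
    """xlen is the largest entry of the supported_xlen node."""
    return max(supported_xlen)


def derive_flen(isa):
    """Return 64 if 'D' is present, 32 if only 'F' is present, else 0."""
    # Keep only the single-letter extensions: drop the RV32/RV64 prefix and
    # everything from the first multi-letter 'Z...' extension onwards.
    letters = isa.upper()[4:].split("Z")[0].split("_")[0]
    if "D" in letters:
        return 64
    if "F" in letters:
        return 32
    return 0


def derive_satp_mode(xlen, legal_modes):
    """Pick the paging scheme named by the largest legal satp.mode value."""
    table = SATP_MODES_RV64 if xlen == 64 else SATP_MODES_RV32
    return table[max(legal_modes)]


def cause_field_bits(max_cause):
    """Assumed rule: bits needed to represent the largest cause value."""
    return max(1, max_cause.bit_length())


if __name__ == "__main__":
    isa = "RV64IMAFDCSUZicsr_Zifencei"
    print(derive_xlen([32, 64]), derive_flen(isa), derive_satp_mode(64, [0, 8, 9]))
```

The inputs are deliberately simple values (a list of xlen entries, an ISA string, a list of legal mode numbers) so that the sketch does not depend on the exact layout of the riscv-config YAML.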
Micro-Architectural Configuration hooks¶
The C-Class core has also defined a custom schema to control various micro-architectural features of the core. A sample configuration file is available in the sample_config directory.
The following provides a list and description of the configuration hooks available at the micro-architectural level. Note, there are also hooks in this configuration which control the bluespec compilation commands and the verilator commands as well.
num_harts¶
Description: Total number of harts to be instantiated in the dummy test-soc. Note that these will be non-coherent cores simply acting as masters on the fast-bus.
Examples:
num_harts: 2
isb_sizes¶
Description: A dictionary controlling the size of the inter-stage buffers of the pipeline. The variable isb_s0s1 controls the size of the isb between stage0 and stage1. Similarly isb_s1s2 dictates the size of the isb between stage1 and stage2 and so on. By increasing isb_s0s1 and isb_s1s2 one can shadow the stalls or latencies in the backend stages of the pipeline by fetching more instructions into the front-end stages of the pipeline.
There is a restriction, however, that isb_s2s3 should always be 1. This is because the outputs of the register file accessed in stage2 are not buffered, and neither is the bypass scheme implemented to handle this scenario.
One can however increase the number of in-flight instructions by increasing the sizes of isb_s3s4 and isb_s4s5 (increasing isb_s3s4 has a larger impact).
Also note that if write-after-write stalls are disabled, the size of the wawid is determined by the sum of isb_s3s4 and isb_s4s5. Therefore, increasing the number of in-flight instructions causes a logarithmic increase in the width of the wawid used for maintaining bypass of operands.
Examples:
isb_sizes:
  isb_s0s1: 2
  isb_s1s2: 2
  isb_s2s3: 1
  isb_s3s4: 2
  isb_s4s5: 2
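The relationship between the ISB sizes and the wawid described above can be sketched in Python as follows. This is only an illustration under an assumption: the id is taken to need enough bits to distinguish isb_s3s4 + isb_s4s5 in-flight instructions, i.e. the ceiling of log2 of that sum. The function names are hypothetical and the exact formula used in the C-Class sources may differ.

```python
# Hypothetical sketch; the exact wawid formula in C-Class may differ.

def check_isb_sizes(isb_sizes):
    """Enforce the documented restriction that isb_s2s3 is always 1."""
    if isb_sizes["isb_s2s3"] != 1:
        raise ValueError("isb_s2s3 must be 1")


def wawid_width(isb_sizes):
    """Assumed width in bits: ceil(log2(isb_s3s4 + isb_s4s5)), at least 1."""
    in_flight = isb_sizes["isb_s3s4"] + isb_sizes["isb_s4s5"]
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1 without using floats.
    return max(1, (in_flight - 1).bit_length())


if __name__ == "__main__":
    sizes = {"isb_s0s1": 2, "isb_s1s2": 2, "isb_s2s3": 1,
             "isb_s3s4": 2, "isb_s4s5": 2}
    check_isb_sizes(sizes)
    print(wawid_width(sizes))
```

Under this assumption, doubling both isb_s3s4 and isb_s4s5 adds only one bit to the id, which is the logarithmic growth mentioned above.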
merged_rf¶
Description: Boolean field to indicate whether the architectural register files for floating-point and integer should be implemented as a single extended register file in hardware or as separate ones. This field only makes sense when ‘F’ support is enabled in the ISA string of the input isa yaml. For certain targets like FPGAs, or for certain technologies, maintaining a single register file might lead to better area and timing.
Examples:
merged_rf: True
total_events¶
Description: This field indicates the total number of events that can be used to program the mhpm counters. It is used to set the size of the event signals that drive the counters.
Examples:
total_events: 28
waw_stalls¶
Description: Indicates if stalls must occur on a WAW hazard. If you are looking for higher performance, set this to False. Setting this to True would lead to instructions stalling in stage3 due to a WAW hazard.
Setting this to False also means the scoreboard will allocate a unique id to the destination register of every instruction that is offloaded for execution. The size of this id depends on the number of in-flight instructions after the execution stage, which in turn depends on the sizes of isb_s3s4 and isb_s4s5 as defined above.
Examples:
waw_stalls: False
iepoch_size¶
Description: Integer value indicating the size of the epochs for the instruction memory subsystem. The only allowed value is 2.
Examples:
iepoch_size: 2
depoch_size¶
Description: Integer value indicating the size of the epochs for the data memory subsystem. The only allowed value is 1.
Examples:
depoch_size: 1
s_extension¶
Description: Describes various supervisor and MMU related parameters. These parameters only take effect when “S” is present in the ISA field.
itlb_size
: integer indicating the number of entries in the fully-associative Instruction TLB
dtlb_size
: integer indicating the number of entries in the fully-associative Data TLB
Examples:
s_extension:
  itlb_size: 4
  dtlb_size: 4
a_extension¶
Description: Describes various A-extension related parameters. These params take effect only when the “A” extension is enabled in the riscv_config ISA
reservation_size
: integer indicating the size of the reservation in bytes. Must be a power of 2 with a minimum value of 4. For an RV64 system the minimum should be 8 bytes.
Examples:
a_extension:
  reservation_size: 8
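The constraints on reservation_size stated above can be expressed as a small check. This is a minimal Python sketch; the function name is hypothetical and it is not part of the C-Class configuration flow.

```python
# Hypothetical validity check for reservation_size, following the rules above.

def is_valid_reservation_size(size, xlen):
    """True if size is a power of 2 and at least 4 bytes (8 bytes for RV64)."""
    minimum = 8 if xlen == 64 else 4
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and size >= minimum


if __name__ == "__main__":
    for size, xlen in [(8, 64), (4, 64), (4, 32), (12, 32)]:
        print(size, xlen, is_valid_reservation_size(size, xlen))
```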
m_extension¶
Description: Describes various M-extension related parameters. These parameters take effect only if “M” is present in the ISA field. The multiplier used in the core is a retimed one. The parameters below indicate the number of input and output registers around the combinational block to enable retiming.
mul_stages_out
: Number of stages to be inserted after the multiplier combinational block. Minimum value is 1.
mul_stages_in
: Number of stages to be inserted before the multiplier combinational block. Minimum value is 0.
div_stages
: an integer indicating the number of cycles for a single division operation. Max value is limited to the XLEN defined in the ISA.
Examples:
m_extension:
  mul_stages_in: 2
  mul_stages_out: 2
  div_stages: 32
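The limits on the m_extension parameters can be collected into one check that reports every violated rule. The sketch below is illustrative only; the function name and the message strings are hypothetical.

```python
# Hypothetical check of an m_extension node against the documented limits.

def check_m_extension(m_ext, xlen):
    """Return a list of problems found in the node (empty if none)."""
    problems = []
    if m_ext["mul_stages_out"] < 1:
        problems.append("mul_stages_out must be at least 1")
    if m_ext["mul_stages_in"] < 0:
        problems.append("mul_stages_in must be at least 0")
    if m_ext["div_stages"] > xlen:
        problems.append("div_stages must not exceed XLEN")
    return problems


if __name__ == "__main__":
    example = {"mul_stages_in": 2, "mul_stages_out": 2, "div_stages": 32}
    print(check_m_extension(example, xlen=64))
```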
branch_predictor¶
Description: Describes various branch predictor related parameters.
instantiate
: boolean value indicating if the predictor needs to be instantiated
predictor
: string indicating the type of predictor to be implemented. Valid values are: ‘gshare’
btb_depth
: integer indicating the size of the branch target buffer
bht_depth
: integer indicating the size of the branch history buffer
history_len
: integer indicating the size of the global history register
history_bits
: integer indicating the number of bits used for indexing bht/btb.
ras_depth
: integer indicating the size of the return address stack.
Examples:
branch_predictor:
  instantiate: True
  predictor: gshare
  btb_depth: 32
  bht_depth: 512
  history_len: 8
  history_bits: 5
  ras_depth: 8
icache_configuration¶
Description: Describes the various instruction cache related features.
instantiate
: boolean value indicating if the instruction cache needs to be instantiated or not
sets
: integer indicating the number of sets in the cache
word_size
: integer indicating the number of bytes in a word. Fixed to 4.
block_size
: integer indicating the number of words in a cache-block.
ways
: integer indicating the number of ways in the cache
fb_size
: integer indicating the number of fill-buffer entries in the cache
replacement
: string indicating the replacement policy. Valid values are: [“PLRU”, “RR”, “Random”]
ecc_enable
: boolean field indicating if ECC should be enabled on the cache.
one_hot_select
: boolean value indicating if the bsv one-hot selection function should be used instead of conventional for-loops to choose amongst lines/fb-lines. This choice has no effect on functionality.
If supervisor is enabled then the max size of a single way should not exceed 4 kilobytes.
Examples:
icache_configuration:
  instantiate: True
  sets: 4
  word_size: 4
  block_size: 16
  ways: 4
  fb_size: 4
  replacement: "PLRU"
  ecc_enable: false
  one_hot_select: false
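To see how the 4 kilobyte limit relates to the parameters above, the following Python sketch computes the size of a single way as sets × block_size × word_size bytes. That formula is the usual definition of a way and is assumed here rather than taken from the C-Class sources; the function names are hypothetical.

```python
# Hypothetical sketch: size of one cache way and the supervisor-mode limit.

MAX_WAY_BYTES_WITH_SUPERVISOR = 4096  # 4 kilobytes, as stated above


def way_size_bytes(cfg):
    """Bytes in one way: sets * words per block * bytes per word."""
    return cfg["sets"] * cfg["block_size"] * cfg["word_size"]


def way_size_ok(cfg, supervisor_enabled):
    """The limit only applies when supervisor is enabled."""
    if not supervisor_enabled:
        return True
    return way_size_bytes(cfg) <= MAX_WAY_BYTES_WITH_SUPERVISOR


if __name__ == "__main__":
    icache = {"sets": 4, "word_size": 4, "block_size": 16, "ways": 4}
    print(way_size_bytes(icache), way_size_ok(icache, supervisor_enabled=True))
```

With the example configuration, one way holds 4 × 16 × 4 = 256 bytes, well under the limit; with the same block size, the limit is reached at 64 sets.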
dcache_configuration¶
Description: Describes the various data cache related features.
instantiate
: boolean value indicating if the data cache needs to be instantiated or not
sets
: integer indicating the number of sets in the cache
word_size
: integer indicating the number of bytes in a word. Fixed to 4.
block_size
: integer indicating the number of words in a cache-block.
ways
: integer indicating the number of ways in the cache
fb_size
: integer indicating the number of fill-buffer entries in the cache
sb_size
: integer indicating the number of store-buffer entries in the cache. Fixed to 2
lb_size
: integer indicating the number of lines to be stored in the store buffer. Applicable only when rwports == 1r1w
ib_size
: integer indicating the number of io-buffer entries in the cache. Defaults to 2
replacement
: string indicating the replacement policy. Valid values are: [“PLRU”, “RR”, “Random”]
ecc_enable
: boolean field indicating if ECC should be enabled on the cache.
one_hot_select
: boolean value indicating if the bsv one-hot selection function should be used instead of conventional for-loops to choose amongst lines/fb-lines. This choice has no effect on functionality.
rwports
: number of read-write ports available on the brams. Allowed values are 1rw, 1r1w and 2rw
If supervisor is enabled then the max size of a single way should not exceed 4 kilobytes.
Examples:
dcache_configuration:
  instantiate: True
  sets: 4
  word_size: 4
  block_size: 16
  ways: 4
  fb_size: 4
  sb_size: 2
  lb_size: 2
  ib_size: 2
  replacement: "PLRU"
  ecc_enable: false
  one_hot_select: false
  rwports: 1r1w
reset_pc¶
Description: Integer value indicating the reset value of program counter
Example:
bus_protocol¶
Description: bus protocol for the master interfaces of the core. Fixed to “AXI4”
Examples:
bus_protocol: AXI4
fpu_trap¶
Description: Boolean value indicating if the core should trap on floating point exception and integer divide-by-zero conditions.
Examples:
fpu_trap: False
verilator_configuration¶
Description: describes the various configurations for verilator compilation.
coverage
: indicates the type of coverage that the user would like to track. Valid values are: [“none”, “line”, “toggle”, “all”]
trace
: boolean value indicating if vcd dumping should be enabled.
threads
: an integer field indicating the number of threads to be used during simulation
verbosity
: a boolean field indicating if the verbose/display statements in the generated verilog should be compiled or not.
out_dir
: name of the directory where the final executable will be dumped.
sim_speed
: indicates if the user would prefer a fast simulation or a slow simulation. Valid values are: [“fast”, “slow”]. Note that selecting “fast” will speed up simulation but slow down compilation, while selecting “slow” does the opposite.
Examples:
verilator_configuration:
  coverage: "none"
  trace: False
  threads: 1
  verbosity: True
  open_ocd: False
  sim_speed: fast
bsc_compile_options¶
Description: Describes the various bluespec compile options
test_memory_size
: size of the BRAM memory in the test-SoC in bytes. Default is 32MB
assertions
: boolean value indicating if assertions used in the design should be compiled or not
trace_dump
: boolean value indicating if the logic to generate a simple trace should be implemented or not. Note this is only for simulation and not a real trace
compile_target
: a string indicating if the bsv files are being compiled for simulation or for asic/fpga synthesis. The valid values are: [ ‘sim’, ‘asic’, ‘fpga’ ]
suppress_warnings
: List of warnings which can be suppressed during bluespec compilation. Valid values are: [“none”, “all”, “G0010”, “T0054”, “G0020”, “G0024”, “G0023”, “G0096”, “G0036”, “G0117”, “G0015”]
ovl_assertions
: boolean value indicating if OVL based assertions must be turned on/off
ovl_path
: string indicating the path where the OVL library is installed.
sva_assertions
: boolean value indicating if SVA based assertions must be turned on/off
verilog_dir
: the name of the directory where the generated verilog will be dumped
open_ocd
: a boolean field indicating if the test-bench should have an open-ocd vpi enabled.
build_dir
: the name of the directory where the bsv build files will be dumped
top_module
: name of the top-level bluespec module to be compiled.
top_file
: file containing the top-level module.
top_dir
: directory containing the top_file.
cocotb_sim
: boolean variable. When set, the terminating conditions in the test-bench environments are disabled, as the cocotb environment is meant to handle that. When set to false, the bluespec test-bench holds the terminating conditions.
Examples:
bsc_compile_options:
  assertions: True
  trace_dump: True
  suppress_warnings: "none"
  top_module: mkTbSoc
  top_file: TbSoc
  top_dir: base_sim
  out_dir: bin
noinline_modules¶
Description: This node contains multiple module names, each of which takes a boolean value. Setting a module to True generates a separate verilog file for that module during bluespec compilation. If set to False, that particular module is inlined into the module above it in the hierarchy in the generated verilog.
Examples:
noinline_modules:
  stage0: False
  stage1: True
  stage2: False
  stage3: False