Skip to content

Preliminary support for trap handling during block emulation #463

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 19, 2024

Conversation

ChinYikMing
Copy link
Collaborator

@ChinYikMing ChinYikMing commented Jun 24, 2024

Trap might occurs during block emulation, for example, page fault. In order to handle trap, we have to temporarily escape from block and jump to the trap handler to handle the trap PC by PC. Once trap is handled, resume the previous execution flow where cause the trap. To support this, we leverage setjmp function to store the state and a flag called is_trapped to indicate the emulation is in a trap handling mode currently. The sret and mret implementation are refactored to control the corresponding CSR before get out from trap.

Some S-mode CSRs are added to riscv_internal to support S-mode. S-mode, and M-mode CSR helper macros are also introduced, so that the CSR can be manipulated in an easier manner.

Related: #310

Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks

Benchmark suite Current: a4f6dab Previous: a28615f Ratio
Dhrystone 1476 Average DMIPS over 10 runs 1591 Average DMIPS over 10 runs 1.08
Coremark 1412.688 Average iterations/sec over 10 runs 1380.125 Average iterations/sec over 10 runs 0.98

This comment was automatically generated by workflow using github-action-benchmark.

@ChinYikMing
Copy link
Collaborator Author

For nested traps concern, I think before entering critical part of kernel control path, the contents of the CSR registers could be saved in the Kernel Mode stack, then restore the contents of the CSR registers when exiting the kernel control path. This way is considered as a software manner.

@jserv jserv marked this pull request as draft June 24, 2024 01:55
@jserv
Copy link
Contributor

jserv commented Jun 24, 2024

please do not merge yet

You can simply convert this pull request to draft, so that it will not be merged.

@ChinYikMing
Copy link
Collaborator Author

According to the "The RISC-V Instruction Set Manual ( Volume I: Unprivileged ISA Document Version 20191213 )", the execution environment interface ( EEI ) may or may NOT guarantee that misaligned loads and stores are fully supported. Thus, I think the emulator maybe no need handling misalignment in hardware manner and just let Linux handle it through exception. In other words, ensuring the emulator set the PC to correct address that stored in stvec CSR.

@jserv
Copy link
Contributor

jserv commented Jun 24, 2024

According to the "The RISC-V Instruction Set Manual ( Volume I: Unprivileged ISA Document Version 20191213 )", the execution environment interface ( EEI ) may or may NOT guarantee that misaligned loads and stores are fully supported. Thus, I think the emulator maybe no need handling misalignment in hardware manner and just let Linux handle it through exception. In other words, ensuring the emulator set the PC to correct address that stored in stvec CSR.

The reason rv32emu includes an option for misalignment handling is to align its behavior with Spike, allowing the two implementations to be compared. You should evaluate the minimal and necessary changes.

@ChinYikMing
Copy link
Collaborator Author

According to the "The RISC-V Instruction Set Manual ( Volume I: Unprivileged ISA Document Version 20191213 )", the execution environment interface ( EEI ) may or may NOT guarantee that misaligned loads and stores are fully supported. Thus, I think the emulator maybe no need handling misalignment in hardware manner and just let Linux handle it through exception. In other words, ensuring the emulator set the PC to correct address that stored in stvec CSR.

The reason rv32emu includes an option for misalignment handling is to align its behavior with Spike, allowing the two implementations to be compared. You should evaluate the minimal and necessary changes.

Got it. The current implementation of misaligned handling is through trap and set PC according to the stvec or mtvec CSR which in my expectation instead of handled by rv32emu itself. So that, the handling of misaligned behavior whether it results in a fatal error or is manageable depends on the implementation of the handler. Some simple test suites will be introduced to test the changes in this PR.

@ChinYikMing ChinYikMing force-pushed the trap branch 2 times, most recently from 9d7f96d to a2c759e Compare June 25, 2024 18:13
@@ -135,7 +188,8 @@ RV_EXCEPTION_LIST
rv->compressed = compress; \
rv->csr_cycle = cycle; \
rv->PC = PC; \
rv_except_##type##_misaligned(rv, IIF(IO)(addr, mask_or_pc)); \
rv->is_trapped = true; \
rv_trap_##type##_misaligned(rv, IIF(IO)(addr, mask_or_pc)); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we regard misalignment as a special case? Or, can it be generalized?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I know what does the special case refer to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I know what does the special case refer to?

I meant to ask whether misalignment exceptions should be treated as one of the supported traps. If so, checks for misalignment exceptions should be implemented indirectly.

Copy link
Collaborator Author

@ChinYikMing ChinYikMing Oct 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the entry.S of RISCV/Linux v6.1.112, it does support do_trap_load_misaligned and do_trap_store_misaligned. Thus, any misaligned load/store fault should be trapped into same path. In other words, trap_handler should be capable to handle the trap.

Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add system emulation hook for GitHub Actions.

@ChinYikMing
Copy link
Collaborator Author

As I realizes that we may not need to enable exception or interrupt delegation from M-mode software to S-mode software since we do not tend to leverage the M-mode software ( such as OpenSBI ) to boot Linux kernel. Nonetheless, I would still like to include the medeleg and mideleg CSRs in the changes for a more comprehensive implementation. They are not entirely useless, as the test suite has utilized them from the very beginning of its design.

Any idea for this? @jserv

@jserv
Copy link
Contributor

jserv commented Jun 30, 2024

Nonetheless, I would still like to include the medeleg and mideleg CSRs in the changes for a more comprehensive implementation. They are not entirely useless, as the test suite has utilized them from the very beginning of its design.

I am afraid of the lack of proper test suite from CSR operations, meaning that we can have limited coverage. Meanwhile, I don't have explicit idea to escape from the chicken and egg problem.

@ChinYikMing ChinYikMing force-pushed the trap branch 6 times, most recently from ae94aa1 to 9509721 Compare October 8, 2024 15:30
@ChinYikMing
Copy link
Collaborator Author

According to the "The RISC-V Instruction Set Manual ( Volume I: Unprivileged ISA Document Version 20191213 )", the execution environment interface ( EEI ) may or may NOT guarantee that misaligned loads and stores are fully supported. Thus, I think the emulator maybe no need handling misalignment in hardware manner and just let Linux handle it through exception. In other words, ensuring the emulator set the PC to correct address that stored in stvec CSR.

The reason rv32emu includes an option for misalignment handling is to align its behavior with Spike, allowing the two implementations to be compared. You should evaluate the minimal and necessary changes.

The rv->is_trapped is compiled only when SYSTEM build flag is enabled. Thus, when comparing the misalignment handling between rv32emu and Spike during ISA simulation, the original behavior is maintained.

@ChinYikMing
Copy link
Collaborator Author

Nonetheless, I would still like to include the medeleg and mideleg CSRs in the changes for a more comprehensive implementation. They are not entirely useless, as the test suite has utilized them from the very beginning of its design.

I am afraid of the lack of proper test suite from CSR operations, meaning that we can have limited coverage. Meanwhile, I don't have explicit idea to escape from the chicken and egg problem.

I intend to remove the exception and interrupt delegation logic for now. We can add it back later if needed.

@ChinYikMing
Copy link
Collaborator Author

#498 must be merged first for this PR to pass the CI.

@ChinYikMing ChinYikMing marked this pull request as ready for review October 8, 2024 20:51
#else
#define RVOP_NO_NEXT(ir) (!ir->next)
#define RVOP_NO_NEXT(ir) (!ir->next IIF(RV32_HAS(SYSTEM))(| rv->is_trapped, ))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be (!ir->next IIF(RV32_HAS(SYSTEM))(|| rv->is_trapped, )) instead?

Copy link
Collaborator Author

@ChinYikMing ChinYikMing Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be (!ir->next IIF(RV32_HAS(SYSTEM))(|| rv->is_trapped, )) instead?

Single | has same effect than double || since the all condition are boolean.

Copy link
Collaborator

@vacantron vacantron Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current semantics is equivalent, but I'm worrying about the support of nested interrupt handling in the future. The is-trapped might become a integer as a counter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current semantics is equivalent, but I'm worrying about the support of nested interrupt handling in the future. The is-trapped might become a integer as a counter.

If used as a counter, the implementation of sret should decrement is_trapped rather than simply setting it to false? Also, other places using is_trapped should increment it?

However, I'm curious in the current emulation, have you observed nested traps occurring, and do these changes need to be made to handle them?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I'm curious in the current emulation, have you observed nested traps occurring, and do these changes need to be made to handle them?

In current, no (if I'm not mistaken). But we need it if we want to handle the page fault during handling the external interrupt, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I'm curious in the current emulation, have you observed nested traps occurring, and do these changes need to be made to handle them?

In current, no (if I'm not mistaken). But we need it if we want to handle the page fault during handling the external interrupt, right?

Indeed. However, we currently don't have a test suite for this situation. The best test suite would be the Linux kernel itself, and we might need an asynchronous mechanism (e.g., a signal thread) to periodically intercept the external interrupt.

For simplicity, I would prefer to keep track of this issue and address it only when it becomes necessary.

Copy link
Contributor

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refine the subject of this pull request according to the proposed changes.

Traps can occur during block emulation, such as from misaligned access.
To handle a trap, we temporarily exit the block and transfer control to
the trap handler, processing each instruction by stepping. After the
trap is resolved, execution resumes from the point where it occurred.
We use a flag called 'is_trapped' to indicate when the emulation is in
trap-handling mode. Additionally, the 'sret' instruction is used to
adjust the relevant supervisor CSR before returning from a trap.

Support for S-mode has been enhanced by adding most S-mode CSRs to
'riscv_internal', along with helper macros for easier manipulation of
S-mode and M-mode CSRs. Although M-mode CSRs are not directly used
during emulation, the helpers have been added for potential future use.
Conditional checks for 'SYSTEM' were added where necessary. The
'LOOKUP_OR_UPDATE_BRANCH_HISTORY_TABLE' functionality is disabled
during trap handling, as traps are managed one instruction at a time
rather than as a chained block.

Related: sysprog21#310
@ChinYikMing ChinYikMing changed the title Support trap handling during block emulation Preliminary support for trap handling during block emulation Oct 18, 2024
@ChinYikMing
Copy link
Collaborator Author

The commits are reordered and squashed.

Place a variable and function at a misaligned address to trigger
all three misaligned traps. The test suite verifies misalignment for
instruction fetch and lw/sw instructions, ensuring the correct trap
handler is invoked. Since our CI integrates the xPack toolchain, we
use it to build the test suite.
@vacantron
Copy link
Collaborator

To handle a trap, we temporarily exit the block and transfer control to the trap handler, processing each instruction by stepping.

Does this change explicitly break the chained block? or just trap into the interrupt service routine and go back to the current when it done?

@ChinYikMing
Copy link
Collaborator Author

To handle a trap, we temporarily exit the block and transfer control to the trap handler, processing each instruction by stepping.

Does this change explicitly break the chained block? or just trap into the interrupt service routine and go back to the current when it done?

The latter.

@jserv jserv merged commit edb5a1b into sysprog21:master Oct 19, 2024
8 checks passed
@jserv jserv added this to the release-2024.2 milestone Oct 19, 2024
vestata pushed a commit to vestata/rv32emu that referenced this pull request Jan 24, 2025
vestata pushed a commit to vestata/rv32emu that referenced this pull request Jan 24, 2025
Preliminary support for trap handling during block emulation
ChinYikMing added a commit to ChinYikMing/rv32emu that referenced this pull request Jan 27, 2025
ChinYikMing added a commit to ChinYikMing/rv32emu that referenced this pull request Jan 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants