This patch introduces `rt_smp_call_request` API to handle queued
requests across cores with user provided data buffer, which provides a
way to request IPI through a non-blocking pattern.
It also resolved several issues in the old implementation:
- Multiple requests from different cores can not be queued in the work
object of the target core.
- Data racing on `rt_smp_work` of same core. If multiple requests came
in turns, or if the call is used by the target cpu, while a new
request is coming, the value will be overwrite.
- Memory vulnerability. The rt_smp_event is allocated on stack, though
the caller may not wait until the call is done.
- API naming problem. Actually we don't provide a way to issue an IPI to
ANY core in mask. What the API do is aligned to MANY pattern.
- FUNC_IPI registering to PIC.
Changes:
- Declared and configured the new `RT_SMP_CALL_IPI` to support
functional IPIs for task requests across cores.
- Replaced the single `rt_smp_work` array with `call_req_cores` to
manage per-core call requests safely.
- Added `_call_req_take` and `_call_req_release` functions for atomic
handling of request lifetimes, preventing data race conditions.
- Replaced single event handling with a queue-based approach
(`call_queue`) for efficient multi-request processing per core.
- Introduced `rt_smp_call_ipi_handler` to process queued requests,
reducing IPI contention by only sending new requests when needed.
- Implemented `_smp_call_remote_request` to handle remote requests
with specific flags, enabling more flexible core-to-core task
signaling.
- Refined `rt_smp_call_req_init` to initialize and track requests
with atomic usage flags, mitigating potential memory vulnerabilities.
Signed-off-by: Shell <smokewood@qq.com>
These changes introduce the rt_interrupt_context family, providing a
mechanism for managing nested interrupts. The context management
ensures proper storage and retrieval of interrupt states, improving
reliability in nested interrupt scenarios by enabling context tracking
across different interrupt levels. This enhancement is essential for
platforms where nested interrupt handling is crucial, such as in real-
time or multi-threaded applications.
Changes:
- Defined rt_interrupt_context structure with context and node fields
in `rtdef.h` to support nested interrupts.
- Added rt_slist_pop function in `rtservice.h` for simplified node
removal in singly linked lists.
- Declared rt_interrupt_context_push, rt_interrupt_context_pop, and
rt_interrupt_context_get functions in `rtthread.h` to manage the
interrupt/exception stack.
- Modified AArch64 CPU support in `cpuport.h` to include
rt_hw_show_register for debugging registers.
- Refactored `_rt_hw_trap_irq` in `trap.c` for context-aware IRQ
handling, with stack push/pop logic to handle nested contexts.
- Implemented interrupt context push, pop, and retrieval logic in
`irq.c` to manage context at the CPU level.
Signed-off-by: Shell <smokewood@qq.com>
This patch improves the atomicity of context switching by ensuring that
the stack pointer (sp) and thread self updates occur simultaneously.
This enhancement is crucial for maintaining thread safety and
preventing potential inconsistencies during context switches.
Changes:
- Modified `cpuport.h` to use `ARM64_THREAD_REG` for thread self access.
- Added an `update_tidr` macro in `context_gcc.S` to streamline thread ID
updates.
- Adjusted `rt_hw_context_switch_to` and `rt_hw_context_switch` to call
`update_tidr`, ensuring atomic updates during context switches.
- Cleaned up `scheduler_mp.c` by removing redundant thread self
assignments.
Signed-off-by: Shell <smokewood@qq.com>
This patch optimizes the user-space context handling in the ARM64
architecture, specifically improving how the context is saved and
restored during system calls and interrupts. The changes make the
code more efficient and easier to maintain, while ensuring proper
preservation of user context during system transitions.
Changes:
- Introduced a parameter for context saving to improve flexibility.
- Replaced hardcoded stack pointer operations with frame-relative
references for better readability and code reuse.
- Simplified context restoration, removing redundant operations like
loading/storing floating-point registers.
Signed-off-by: Shell <smokewood@qq.com>
1. RT_FIELD_PREP: prepare a bitfield element.
2. RT_FIELD_GET: extract a bitfield element.
3. rt_offsetof: member offset of a struct
4. rt_upper_32_bits: high 32 bits of value.
5. rt_lower_32_bits: lower 32 bits of value.
6. rt_upper_16_bits: high 16 bits of value.
7. rt_lower_16_bits: lower 16 bits of value.
8. rt_max_t: fix type of max(...).
9. rt_ilog2: integer logarithm base 2.
Signed-off-by: GuEe-GUI <2991707448@qq.com>
feat: overall implementation of vector irq
This patch generalize the irq handling on up/mp system by adding the
`rt_hw_irq_exit()` & `rt_hw_vector_irq_sched()` API.
Changes:
- Added `rt_hw_irq_exit()` and `rt_hw_vector_irq_sched()` APIs for unified IRQ management.
- Refactored assembly code for both UP and MP systems to use the new IRQ handling flow.
- Removed redundant code and optimized exception handling paths.
Signed-off-by: Shell <smokewood@qq.com>
The rtdef.h is a big header with multiple dependency inside,
which makes it easier to introduce recursion dependency.
Signed-off-by: Shell <smokewood@qq.com>
This patch focuses on the ARM64 general context handling code.
The modifications are aimed at enhancing performance by simplifying
context save/restore operations.
Changes include:
- Adjusted stack alignment in `arch_set_thread_context` function.
- Updated `lwp_gcc.S` to reset frame pointer and link register.
- Refined `rt_hw_backtrace_frame_unwind` to handle user space address checks.
- Added `GET_THREAD_SELF` macro in `asm-generic.h`.
- Simplified context saving/restoring in `context_gcc.h` and related files.
- Optimized `rt_hw_context_switch_interrupt` and related assembly routines.
Signed-off-by: Shell <smokewood@qq.com>
This patch improves the efficiency and readability of the AArch64 common setup
code by calculating the `PV_OFFSET` once at the start and reusing the value.
This change reduces redundant calculations.
Signed-off-by: Shell <smokewood@qq.com>
[feat] Enhance support for backtrace service
rt_backtrace_formatted_print() and rt_backtrace_to_buffer() to help
debug routines.
Also, following modification are included:
- make rt_backtrace_frame patchable with weak attr
- replace lwp backtrace with sync output
Signed-off-by: Shell <smokewood@qq.com>
* [libcpu] arm64: Add hardware thread_self support
This patch introduces hardware-based thread self-identification
for the AArch64 architecture. It optimizes thread management by
using hardware registers to store and access the current thread's
pointer, reducing overhead and improving overall performance.
Changes include:
- Added `ARCH_USING_HW_THREAD_SELF` configuration option.
- Modified `rtdef.h`, `rtsched.h` to conditionally include
`critical_switch_flag` based on the new config.
- Updated context management in `context_gcc.S`, `cpuport.h`
to support hardware-based thread self.
- Enhanced `scheduler_mp.c` and `thread.c` to leverage the new
hardware thread self feature.
These modifications ensure better scheduling and thread handling,
particularly in multi-core environments, by minimizing the
software overhead associated with thread management.
Signed-off-by: Shell <smokewood@qq.com>
* fixup: address suggestion
* fixup: rt_current_thread as global
* scheduler: add cpu object for UP scheduler
Also, maintain the rt_current_thread in cpu object on UP scheduler.
---------
Signed-off-by: Shell <smokewood@qq.com>
* [libcpu/arm64] add C11 atomic ticket spinlock
Replace the former implementation of flag-based spinlock which is unfair
Besides, C11 atomic implementation is more readable (it's C anyway),
and maintainable. Cause toolchain can use their builtin optimization and
tune for different micro-architectures. For example armv8.5 introduces a
better instruction. The compiler can help with that when it knows your
target platform in support of it.
Signed-off-by: Shell <smokewood@qq.com>
* fixup: RT_CPUS_NR
---------
Signed-off-by: Shell <smokewood@qq.com>
* [ofw] dealing with mem region out of kernel space
- Fix parameter checking in _out_of_range() that NULL is excluded for
fixed mapping
- Split page install with a deferred stage to avoid mapping over
ARCH_EARLY_MAP_SIZE
Signed-off-by: Shell <smokewood@qq.com>
* fixup: restrict vstart for using of RT_NULL
---------
Signed-off-by: Shell <smokewood@qq.com>