This patch optimizes the user-space context handling in the ARM64
architecture, specifically improving how the context is saved and
restored during system calls and interrupts. The changes make the
code more efficient and easier to maintain, while ensuring proper
preservation of user context during system transitions.
Changes:
- Introduced a parameter for context saving to improve flexibility.
- Replaced hardcoded stack pointer operations with frame-relative
references for better readability and code reuse.
- Simplified context restoration, removing redundant operations like
loading/storing floating-point registers.
Signed-off-by: Shell <smokewood@qq.com>
1. RT_FIELD_PREP: prepare a bitfield element.
2. RT_FIELD_GET: extract a bitfield element.
3. rt_offsetof: member offset of a struct.
4. rt_upper_32_bits: high 32 bits of value.
5. rt_lower_32_bits: lower 32 bits of value.
6. rt_upper_16_bits: high 16 bits of value.
7. rt_lower_16_bits: lower 16 bits of value.
8. rt_max_t: max(...) with operands cast to a fixed type.
9. rt_ilog2: integer logarithm base 2.
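
As a rough illustration of what helpers like these typically compute, here is a minimal sketch in plain C. The `demo_*` names are hypothetical stand-ins written for this note; they are not the definitions added by the patch, which may be macros with different argument order or type handling.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the RT_* definitions. */

/* Index of the lowest set bit; `mask` must be non-zero. */
static inline unsigned demo_mask_shift(uint32_t mask)
{
    unsigned shift = 0;
    while (((mask >> shift) & 1u) == 0u)
        shift++;
    return shift;
}

/* Shift `val` into the position described by `mask` and clip it to the field. */
static inline uint32_t demo_field_prep(uint32_t mask, uint32_t val)
{
    return (val << demo_mask_shift(mask)) & mask;
}

/* Extract the field described by `mask` from `reg`. */
static inline uint32_t demo_field_get(uint32_t mask, uint32_t reg)
{
    return (reg & mask) >> demo_mask_shift(mask);
}

/* High and low 32-bit halves of a 64-bit value. */
static inline uint32_t demo_upper_32_bits(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

static inline uint32_t demo_lower_32_bits(uint64_t v)
{
    return (uint32_t)(v & 0xffffffffu);
}

/* Integer log2: index of the highest set bit; `v` must be non-zero. */
static inline unsigned demo_ilog2(uint32_t v)
{
    unsigned r = 0;
    while (v >>= 1)
        r++;
    return r;
}
```

For example, with a field mask of `0x00f0`, preparing the value `0xa` yields `0x00a0`, and extracting that field from `0x12a4` yields `0xa` again.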
Signed-off-by: GuEe-GUI <2991707448@qq.com>
feat: overall implementation of vector irq
This patch generalizes IRQ handling on UP/MP systems by adding the
`rt_hw_irq_exit()` and `rt_hw_vector_irq_sched()` APIs.
Changes:
- Added `rt_hw_irq_exit()` and `rt_hw_vector_irq_sched()` APIs for unified IRQ management.
- Refactored assembly code for both UP and MP systems to use the new IRQ handling flow.
- Removed redundant code and optimized exception handling paths.
Signed-off-by: Shell <smokewood@qq.com>
rtdef.h is a large header with many internal dependencies,
which makes it easy to introduce recursive dependencies.
Signed-off-by: Shell <smokewood@qq.com>
This patch focuses on the ARM64 general context handling code.
The modifications are aimed at enhancing performance by simplifying
context save/restore operations.
Changes include:
- Adjusted stack alignment in `arch_set_thread_context` function.
- Updated `lwp_gcc.S` to reset frame pointer and link register.
- Refined `rt_hw_backtrace_frame_unwind` to handle user space address checks.
- Added `GET_THREAD_SELF` macro in `asm-generic.h`.
- Simplified context saving/restoring in `context_gcc.h` and related files.
- Optimized `rt_hw_context_switch_interrupt` and related assembly routines.
Signed-off-by: Shell <smokewood@qq.com>
This patch improves the efficiency and readability of the AArch64 common setup
code by calculating the `PV_OFFSET` once at the start and reusing the value.
This change reduces redundant calculations.
Signed-off-by: Shell <smokewood@qq.com>
[feat] Enhance support for backtrace service
Add rt_backtrace_formatted_print() and rt_backtrace_to_buffer() to help
debug routines.
The following modifications are also included:
- make rt_backtrace_frame patchable with weak attr
- replace lwp backtrace with sync output
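
For readers unfamiliar with the weak-attribute pattern mentioned above, a minimal sketch follows. The frame type and function name are hypothetical and chosen only for illustration; the actual signature in the patch is not shown here. It assumes a GCC- or Clang-compatible toolchain.

```c
/* Hypothetical frame type and hook name, for illustration only. */
struct demo_frame
{
    unsigned long fp; /* frame pointer */
    unsigned long pc; /* program counter */
};

/*
 * Weak default definition. If another object file provides a strong
 * (non-weak) function with the same name, the linker picks that one
 * and this default is discarded.
 */
__attribute__((weak)) int demo_backtrace_frame_unwind(struct demo_frame *frame)
{
    (void)frame;
    return -1; /* default: unwinding is not supported */
}
```

A port that wants its own behaviour defines `demo_backtrace_frame_unwind()` without the attribute, and no change to the generic code is needed.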
Signed-off-by: Shell <smokewood@qq.com>
* [libcpu] arm64: Add hardware thread_self support
This patch introduces hardware-based thread self-identification
for the AArch64 architecture. It optimizes thread management by
using hardware registers to store and access the current thread's
pointer, reducing overhead and improving overall performance.
Changes include:
- Added `ARCH_USING_HW_THREAD_SELF` configuration option.
- Modified `rtdef.h`, `rtsched.h` to conditionally include
`critical_switch_flag` based on the new config.
- Updated context management in `context_gcc.S`, `cpuport.h`
to support hardware-based thread self.
- Enhanced `scheduler_mp.c` and `thread.c` to leverage the new
hardware thread self feature.
These modifications ensure better scheduling and thread handling,
particularly in multi-core environments, by minimizing the
software overhead associated with thread management.
Signed-off-by: Shell <smokewood@qq.com>
* fixup: address suggestion
* fixup: rt_current_thread as global
* scheduler: add cpu object for UP scheduler
Also, maintain the rt_current_thread in cpu object on UP scheduler.
---------
Signed-off-by: Shell <smokewood@qq.com>
* [libcpu/arm64] add C11 atomic ticket spinlock
Replace the former flag-based spinlock implementation, which is unfair.
Besides, the C11 atomic implementation is more readable (it's C anyway)
and more maintainable, because the toolchain can apply its built-in
optimizations and tune for different micro-architectures. For example,
armv8.5 introduces a better instruction, and the compiler can use it
when it knows the target platform supports it.
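
To make the fairness argument concrete, here is a minimal sketch of a ticket spinlock built on C11 atomics. The type and function names are hypothetical and the layout is an assumption for illustration; it is not the code added by this patch. Each waiter takes a ticket and spins until that ticket is served, so the lock is granted in arrival order.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock type; not the layout used by the patch. */
typedef struct
{
    atomic_uint_fast32_t next;  /* next ticket to hand out */
    atomic_uint_fast32_t owner; /* ticket currently being served */
} demo_ticket_lock_t;

static inline void demo_ticket_lock_init(demo_ticket_lock_t *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

static inline void demo_ticket_lock(demo_ticket_lock_t *l)
{
    /* Take a ticket; relaxed is enough because ordering comes from the acquire load below. */
    uint_fast32_t ticket =
        atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

    /* Spin until our ticket is the one being served. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
    {
        /* busy-wait */
    }
}

static inline void demo_ticket_unlock(demo_ticket_lock_t *l)
{
    /* Only the holder writes `owner`, so a relaxed read followed by a release store is safe. */
    uint_fast32_t cur = atomic_load_explicit(&l->owner, memory_order_relaxed);
    atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

A flag-based lock lets whichever CPU wins the next atomic exchange take the lock, so a waiter can be starved. With tickets, a waiter is overtaken only by CPUs that arrived before it.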
Signed-off-by: Shell <smokewood@qq.com>
* fixup: RT_CPUS_NR
---------
Signed-off-by: Shell <smokewood@qq.com>
* [ofw] dealing with mem region out of kernel space
- Fix parameter checking in _out_of_range() so that NULL is excluded for
fixed mapping
- Split page install with a deferred stage to avoid mapping over
ARCH_EARLY_MAP_SIZE
Signed-off-by: Shell <smokewood@qq.com>
* fixup: restrict vstart for using of RT_NULL
---------
Signed-off-by: Shell <smokewood@qq.com>