Inside the idle loop, in some configuration, IRQ is unlocked and
then immediately locked again. There is a side effect:
1. IRQ is unlocked in middle of the loop.
2. Another thread (A) can now run so idle thread is un-scheduled.
3. Thread A runs to its end and going through the thread
self-abort path.
4. Idle thread is rescheduled again, and continues to run
the remaining loop when it eventuall calls k_cpu_idle().
The "pending abort" path is not being executed on thread A
at this point.
5. Now, thread A is suspended, and the CPU is in idle waiting
for interrupts (e.g. timeouts).
6. Thread B is waiting to join on thread A. Since thread A has
not been terminated yet so thread B is waiting until
the idle thread runs again and starts executing from
the beginning of while loop.
7. Depending on how many threads are running and how active
the platform is, idle thread may not run again for a while,
resulting in thread B appearing to be stuck.
To avoid this situation, the unlock/lock pair in middle of
the loop is removed so no rescheduling can be done mid-loop.
When there is no thread abort pending, it simply locks IRQ
and calls k_cpu_idle(). This is almost identical to the idle
loop before the thread abort code was introduced (except
the check for cpu->pending_abort).
Fixes#30573
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Cleanup code for power management and remove some duplication and
isolate power management code from the kernel code.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
- Remove SYS_ prefix
- shorten POWER_MANAGEMENT to just PM
- DEVICE_POWER_MANAGEMENT -> PM_DEVICE
and use PM_ as the prefix for all PM related Kconfigs
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Most of kernel files where declaring os module without providing
log level. Because of that default log level was used instead of
CONFIG_KERNEL_LOG_LEVEL.
Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
Fixes races where threads on another CPU are joining the
exiting thread, since it could still be running when
the joiners wake up on a different CPU.
Fixes problems where the thread object is still being
used by the kernel when the fn_abort() function is called,
preventing the thread object from being recycled or
freed back to a slab pool.
Fixes a race where a thread is aborted from one CPU while
it self-aborts on another CPU, that was currently worked
around with a busy-wait.
Precedent for doing this comes from FreeRTOS, which also
performs final thread cleanup in the idle thread.
Some logic in z_thread_single_abort() rearranged such that
when we release sched_spinlock, the thread object pointer
is never dereferenced by the kernel again; join waiters
or fn_abort() logic may free it immediately.
An assertion added to z_thread_single_abort() to ensure
it never gets called with thread == _current outside of an ISR.
Some logic has been added to ensure z_thread_single_abort()
tasks don't run more than once.
Fixes: #26486
Related to: #23063#23062
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.
This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This commit refactors kernel and arch headers to establish a boundary
between private and public interface headers.
The refactoring strategy used in this commit is detailed in the issue
This commit introduces the following major changes:
1. Establish a clear boundary between private and public headers by
removing "kernel/include" and "arch/*/include" from the global
include paths. Ideally, only kernel/ and arch/*/ source files should
reference the headers in these directories. If these headers must be
used by a component, these include paths shall be manually added to
the CMakeLists.txt file of the component. This is intended to
discourage applications from including private kernel and arch
headers either knowingly and unknowingly.
- kernel/include/ (PRIVATE)
This directory contains the private headers that provide private
kernel definitions which should not be visible outside the kernel
and arch source code. All public kernel definitions must be added
to an appropriate header located under include/.
- arch/*/include/ (PRIVATE)
This directory contains the private headers that provide private
architecture-specific definitions which should not be visible
outside the arch and kernel source code. All public architecture-
specific definitions must be added to an appropriate header located
under include/arch/*/.
- include/ AND include/sys/ (PUBLIC)
This directory contains the public headers that provide public
kernel definitions which can be referenced by both kernel and
application code.
- include/arch/*/ (PUBLIC)
This directory contains the public headers that provide public
architecture-specific definitions which can be referenced by both
kernel and application code.
2. Split arch_interface.h into "kernel-to-arch interface" and "public
arch interface" divisions.
- kernel/include/kernel_arch_interface.h
* provides private "kernel-to-arch interface" definition.
* includes arch/*/include/kernel_arch_func.h to ensure that the
interface function implementations are always available.
* includes sys/arch_interface.h so that public arch interface
definitions are automatically included when including this file.
- arch/*/include/kernel_arch_func.h
* provides architecture-specific "kernel-to-arch interface"
implementation.
* only the functions that will be used in kernel and arch source
files are defined here.
- include/sys/arch_interface.h
* provides "public arch interface" definition.
* includes include/arch/arch_inlines.h to ensure that the
architecture-specific public inline interface function
implementations are always available.
- include/arch/arch_inlines.h
* includes architecture-specific arch_inlines.h in
include/arch/*/arch_inline.h.
- include/arch/*/arch_inline.h
* provides architecture-specific "public arch interface" inline
function implementation.
* supersedes include/sys/arch_inline.h.
3. Refactor kernel and the existing architecture implementations.
- Remove circular dependency of kernel and arch headers. The
following general rules should be observed:
* Never include any private headers from public headers
* Never include kernel_internal.h in kernel_arch_data.h
* Always include kernel_arch_data.h from kernel_arch_func.h
* Never include kernel.h from kernel_struct.h either directly or
indirectly. Only add the kernel structures that must be referenced
from public arch headers in this file.
- Relocate syscall_handler.h to include/ so it can be used in the
public code. This is necessary because many user-mode public codes
reference the functions defined in this header.
- Relocate kernel_arch_thread.h to include/arch/*/thread.h. This is
necessary to provide architecture-specific thread definition for
'struct k_thread' in kernel.h.
- Remove any private header dependencies from public headers using
the following methods:
* If dependency is not required, simply omit
* If dependency is required,
- Relocate a portion of the required dependencies from the
private header to an appropriate public header OR
- Relocate the required private header to make it public.
This commit supersedes #20047, addresses #19666, and fixes#3056.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
These are renamed to z_timestamp_main and z_timestamp_idle,
and now specified in kernel_internal.h.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
An #endif and the brace terminating a compound statement were
transposed, causing compilation errors with the above-specified
combination of configuration options.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Corrected the define of SMP_FALLBACK to prevent llvm warning.
llvm issues a warning as the behaviour of using defined(x) inside a
macro expansion is undefined (https://reviews.llvm.org/D15866).
Signed-off-by: Jan Van Winkel <jan.van_winkel@dxplore.eu>
Now that we have a working IPI framework, there's no reason for the
default spin loop for the SMP idle thread. Just use the default
platform idle and send an IPI when a new thread is readied.
Long term, this can be optimized if necessary (e.g. only send the IPI
to idling CPUs, or check priorities, etc...), but for a 2-cpu system
this is a very reasonable default.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The boot time measurement sample was giving bogus values on x86: an
assumption was made that the system timer is in sync with the CPU TSC,
which is not the case on most x86 boards.
Boot time measurements are no longer permitted unless the timer source
is the local APIC. To avoid issues of TSC scaling, the startup datum
has been forced to 0, which is in line with the ARM implementation
(which is the only other platform which supports this feature).
Cleanups along the way:
As the datum is now assumed zero, some variables are removed and
calculations simplified. The global variables involved in boot time
measurements are moved to the kernel.h header rather than being
redeclared in every place they are referenced. Since none of the
measurements actually use 64-bit precision, the samples are reduced
to 32-bit quantities.
In addition, this feature has been enabled in long mode.
Fixes: #19144
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
move power.h to power/power.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move internal and architecture specific headers from include/drivers to
subfolder for timer:
include/drivers/timer
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
This commit cleans up names of system power management functions by
assuring that:
- all functions start with 'sys_pm_' prefix
- API functions which should not be exposed to the user start with '_'
- name of the function hints at its purpose
Signed-off-by: Piotr Mienkowski <piotr.mienkowski@gmail.com>
There exists SoCs, e.g. STM32L4, where one of the low power modes
reduces CPU frequency and supply voltage but does not stop the CPU. Such
power modes are currently not supported by Zephyr.
To facilitate adding support for such class of power modes in the future
and to ensure the naming convention makes it clear that the currently
supported power modes stop the CPU this commit renames Low Power States
to Slep States and updates the documentation.
Signed-off-by: Piotr Mienkowski <piotr.mienkowski@gmail.com>
Update reserved function names starting with one underscore, replacing
them as follows:
'_k_' with 'z_'
'_K_' with 'Z_'
'_handler_' with 'z_handl_'
'_Cstart' with 'z_cstart'
'_Swap' with 'z_swap'
This renaming is done on both global and those static function names
in kernel/include and include/. Other static function names in kernel/
are renamed by removing the leading underscore. Other function names
not starting with any prefix listed above are renamed starting with
a 'z_' or 'Z_' prefix.
Function names starting with two or three leading underscores are not
automatcally renamed since these names will collide with the variants
with two or three leading underscores.
Various generator scripts have also been updated as well as perf,
linker and usb files. These are
drivers/serial/uart_handlers.c
include/linker/kobject-text.ld
kernel/include/syscall_handler.h
scripts/gen_kobject_list.py
scripts/gen_syscall_header.py
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
This commit changes the names of SYS_POWER_DEEP_SLEEP* Kconfig
options in order to match SYS_POWER_LOW_POWER_STATE* naming
scheme.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
The SYS_POWER_LOW_POWER_STATE_SUPPORTED and SYS_POWER_LOW_POWER_STATE
suggests one low power state but these options control multiple
low power state. This commit uses plural in the names to indicate
that.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
The power management framework used two different abstractions
to describe power states. The SYS_PM_* given coarse information
what kind of power state (low power or deep sleep) was used,
while the SYS_POWER_STATE_* abstraction provided information
about particular power mode.
This commit removes the SYS_PM_* abstraction as the same
information is already carried in SYS_POWER_STATE_*.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
System must not set the clock expiry via backdoor as it may
effect in unbound time drift of all scheduled timeouts.
Fixes: #11502
Signed-off-by: Pawel Dunaj <pawel.dunaj@nordicsemi.no>
This got broken. Add some #ifery to handle the case. Not clean, will
clean up in a future pass once the API is final.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
If the idle code was detecting that it needed to sleep for less than
CONFIG_SYS_TICKLESS_IDLE_THRESH, then it would never call
z_clock_set_timeout() at all, which means that the system would never
wake up unless it already had a timeout scheduled! Apparently we
lacked a test case to detect this condition.
Honestly this seems like a crazy feature to me. There's no benefit in
delivering needless tick announcements. If the system has the
capacity to enter deeper sleep for long timeouts, that's already
exposed via the PM APIs, the timer subsystem needn't be involved.
But... we actually have a test (tickless_concept) that looks at this,
so support it for now and consider deprecation later.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This code (just refactored as part of the timer API work) turns out to
be needless. It's trying to detect the case where we're being asked
to idle for zero time, but that's not possible with a properly
functioning timer driver: the call to z_clock_announce() must happen
out of an interrupt, and this is the idle thread, which must sit below
any possible interrupt priority. The call to z_clock_uptime() must
not ever return "too late" until after the timer interrupt has fired,
at which point we'll be inspecting the next timeout (which itself is
guaranteed to be in the future for the same reason).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The tickless driver had a bunch of "hairy" APIs which forced the timer
drivers to do needless low-level accounting for the benefit of the
kernel, all of which then proceeded to implement them via cut and
paste. Specifically the "program_time" calls forced the driver to
expose to the kernel exactly when the next interrupt was due and how
much time had elapsed, in a parallel API to the existing "what time is
it" and "announce a tick" interrupts that carry the same information.
Remove these from the kernel, replacing them with synthesized logic
written in terms of the simpler APIs.
In some cases there will be a performance impact due to the use of the
64 bit uptime call, but that will go away soon.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Rename timer driver API functions to be consistent. ADD DOCS TO THE
HEADER so implementations understand what the requirements are.
Remove some unused functions that don't need declarations here.
Also removes the per-platform #if's around the power control callback
in favor of a weak-linked noop function in the driver initialization
(adds a few bytes of code to default platforms -- we'll live, I
think).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The existing API had two almost identical functions: _set_time() and
_timer_idle_enter(). Both simply instruct the timer driver to set the
next timer interrupt expiration appropriately so that the call to
z_clock_announce() will be made at the requested number of ticks. On
most/all hardware, these should be implementable identically.
Unfortunately because they are specified differently, existing drivers
have implemented them in parallel.
Specify a new, unified, z_clock_set_timeout(). Document it clearly
for implementors. And provide a shim layer for legacy drivers that
will continue to use the old functions.
Note that this patch fixes an existing bug found by inspection: the
old call to _set_time() out of z_clock_announce() failed to test for
the "wait forever" case in the situation where clock_always_on is
true, meaning that a system that reached this point and then never set
another timeout would freeze its uptime clock incorrectly.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
MISRA C requires that every controlling expression of and if or while
statement have a boolean type.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Define _sys_soc_resume() only if CONFIG_SYS_POWER_LOW_POWER_STATE
is enabled.
Define _sys_soc_resume_from_deep_sleep() only if
CONFIG_SYS_POWER_DEEP_SLEEP is enabled.
Signed-off-by: Ramakrishna Pallala <ramakrishna.pallala@intel.com>
This was wrong in two ways, one subtle and one awful.
The subtle problem was that the IRQ lock isn't actually globally
recursive, it gets reset when you context switch (i.e. a _Swap()
implicitly releases and reacquires it). So the recursive count I was
keeping needs to be per-thread or else we risk deadlock any time we
swap away from a thread holding the lock.
And because part of my brain apparently knew this, there was an
"optimization" in the code that tested the current count vs. zero
outside the lock, on the argument that if it was non-zero we must
already hold the lock. Which would be true of a per-thread counter,
but NOT a global one: the other CPU may be holding that lock, and this
test will tell you *you* do. The upshot is that a recursive
irq_lock() would almost always SUCCEED INCORRECTLY when there was lock
contention. That this didn't break more things is amazing to me.
The rework is actually simpler than the original, thankfully. Though
there are some further subtleties:
* The lock state implied by irq_lock() allows the lock to be
implicitly released on context switch (i.e. you can _Swap() with the
lock held at a recursion level higher than 1, which needs to allow
other processes to run). So return paths into threads from _Swap()
and interrupt/exception exit need to check and restore the global
lock state, spinning as needed.
* The idle loop design specifies a k_cpu_idle() function that is on
common architectures expected to enable interrupts (for obvious
reasons), but there is no place to put non-arch code to wire it into
the global lock accounting. So on SMP, even CPU0 needs to use the
"dumb" spinning idle loop.
Finally this patch contains a simple bugfix too, found by inspection:
the interrupt return code used when CONFIG_SWITCH is enabled wasn't
correctly setting the active flag on the threads, opening up the
potential for a race that might result in a thread being scheduled on
two CPUs simultaneously.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Names that begin with an underscore are reserved by the C standard.
This patch does not change names of functions defined and implemented
in header files.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
A pure timer-based idle won't work well in SMP. Without an IPI to
wake up idle CPUs out of the scheduler they will sleep far too long
and the main CPU will do all the scheduling of wake-up-and-sleep
processes. Instead just have the auxilary CPUs do a traditional
busy-wait scheduler in their idle loop.
We will need to revisit an architecture that allows both
wait-for-timer-interrupt idle and SMP.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
1. Changed _tsc_read() to k_cycles_get_32(). Thus reading the
time stamp will be agnostic of the architecutre used.
2. Changed the variable names from *_tsc to *_time_stamp.
JIRA: ZEP-1426
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Adds event based scheduling logic to the kernel. Updates
management of timeouts, timers, idling etc. based on
time tracked at events rather than periodic ticks. Provides
interfaces for timers to announce and get next timer expiry
based on kernel scheduling decisions involving time slicing
of threads, timeouts and idling. Uses wall time units instead
of ticks in all scheduling activities.
The implementation involves changes in the following areas
1. Management of time in wall units like ms/us instead of ticks
The existing implementation already had an option to configure
number of ticks in a second. The new implementation builds on
top of that feature and provides option to set the size of the
scheduling granurality to mili seconds or micro seconds. This
allows most of the current implementation to be reused. Due to
this re-use and co-existence with tick based kernel, the names
of variables may contain the word "tick". However, in the
tickless kernel implementation, it represents the currently
configured time unit, which would be be mili seconds or
micro seconds. The APIs that take time as a parameter are not
impacted and they continue to pass time in mili seconds.
2. Timers would not be programmed in periodic mode
generating ticks. Instead they would be programmed in one
shot mode to generate events at the time the kernel scheduler
needs to gain control for its scheduling activities like
timers, timeouts, time slicing, idling etc.
3. The scheduler provides interfaces that the timer drivers
use to announce elapsed time and get the next time the scheduler
needs a timer event. It is possible that the scheduler may not
need another timer event, in which case the system would wait
for a non-timer event to wake it up if it is idling.
4. New APIs are defined to be implemented by timer drivers. Also
they need to handler timer events differently. These changes
have been done in the HPET timer driver. In future other timers
that support tickles kernel should implement these APIs as well.
These APIs are to re-program the timer, update and announce
elapsed time.
5. Philosopher and timer_api applications have been enabled to
test tickless kernel. Separate configuration files are created
which define the necessary CONFIG flags. Run these apps using
following command
make pristine && make BOARD=qemu_x86 CONF_FILE=prj_tickless.conf qemu
Jira: ZEP-339 ZEP-1946 ZEP-948
Change-Id: I7d950c31bf1ff929a9066fad42c2f0559a2e5983
Signed-off-by: Ramesh Thomas <ramesh.thomas@intel.com>
Convert code to use u{8,16,32,64}_t and s{8,16,32,64}_t instead of C99
integer types. This handles the remaining includes and kernel, plus
touching up various points that we skipped because of include
dependancies. We also convert the PRI printf formatters in the arch
code over to normal formatters.
Jira: ZEP-2051
Change-Id: Iecbb12601a3ee4ea936fd7ddea37788a645b08b0
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
Replace the existing Apache 2.0 boilerplate header with an SPDX tag
throughout the zephyr code tree. This patch was generated via a
script run over the master branch.
Also updated doc/porting/application.rst that had a dependency on
line numbers in a literal include.
Manually updated subsys/logging/sys_log.c that had a malformed
header in the original file. Also cleanup several cases that already
had a SPDX tag and we either got a duplicate or missed updating.
Jira: ZEP-1457
Change-Id: I6131a1d4ee0e58f5b938300c2d2fc77d2e69572c
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
Also remove mentions of unified kernel in various places in the kernel,
samples and documentation.
Change-Id: Ice43bc73badbe7e14bae40fd6f2a302f6528a77d
Signed-off-by: Anas Nashif <anas.nashif@intel.com>