While documenting the float conversion code, I found there was room
for some optimization. In the process I added test cases to cover edge
cases, e.g. making sure proper rounding is applied and that no loss
of precision was introduced. The compiled code should be smaller and
faster.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Add an optimized realloc() implementation that can successfully expand
allocations in place if there exists enough free memory after the
supplied block.
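A rough sketch of the approach (helper names are hypothetical, not the
actual sys_heap internals): grow in place when the neighbouring chunk is
free and large enough, otherwise fall back to allocate/copy/free.

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct heap;                                          /* opaque for the sketch */
extern size_t block_size(struct heap *h, void *p);    /* hypothetical helpers  */
extern bool   right_is_free(struct heap *h, void *p);
extern size_t right_size(struct heap *h, void *p);
extern void   merge_right(struct heap *h, void *p);
extern void  *heap_alloc(struct heap *h, size_t sz);
extern void   heap_free(struct heap *h, void *p);

void *realloc_sketch(struct heap *h, void *p, size_t new_size)
{
    size_t old_size = block_size(h, p);

    if (new_size <= old_size) {
        return p;                       /* existing block already suffices */
    }
    if (right_is_free(h, p) && old_size + right_size(h, p) >= new_size) {
        merge_right(h, p);              /* expand in place */
        return p;
    }

    void *q = heap_alloc(h, new_size);  /* last resort: move the data */

    if (q != NULL) {
        memcpy(q, p, old_size);
        heap_free(h, p);
    }
    return q;
}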
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The biggest required padding is equal to `align - chunk_header_bytes`
and not `align - 1` given that the header already contributes to the
padding.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
LLVM building for qemu_x86 appears to have an optimization bug where a
union that is assigned to hold values read via va_arg() is inferred
to be a constant value, and so is placed in ROM with all-zero content.
Prevent this by packing the conversion state and the value union into
a single container structure that's stack allocated.
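A minimal sketch of the workaround (member names are illustrative, not
the actual cbprintf internals): keeping the state and the value union in
one automatic variable leaves the optimizer no room to treat the union
as a ROM constant.

#include <stdarg.h>

union argument_value {                  /* value pulled via va_arg() */
    int i;
    unsigned int u;
    double d;
    const void *p;
};

struct conversion_context {             /* single stack-allocated container */
    int width;                          /* parsed conversion state ...      */
    int precision;
    union argument_value value;         /* ... and the value, side by side  */
};

extern void emit(const struct conversion_context *ctx);  /* hypothetical */

static void convert_one(va_list ap)
{
    struct conversion_context ctx = { .width = 0, .precision = -1 };

    ctx.value.i = va_arg(ap, int);
    emit(&ctx);
}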
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
Although using format flags with the %p conversion is not defined
behavior, there is a desire to have them work, so add a test and fix
the complete implementation so it passes.
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
Simplify the code to increase readability, and fix right-padding
for %p.
Also, the compiled code is smaller with those changes applied.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
This fixes an issue where the %p specifier always generated "(nil)"
on SPARC. The failing test cases were:
tests/lib/sprintf/libraries.libc.sprintf
tests/kernel/common/kernel.common.misra
tests/kernel/common/kernel.common.tls
tests/kernel/common/kernel.common
The exact logic behind the issue has not been fully analyzed, but
it can be observed that this commit eliminates one occurrence of
undefined behavior (only the union field most recently written may be read).
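For illustration, the general shape of that pattern (generic example,
not the actual formatter code):

union out_value {
    const void *p;
    unsigned long u;
};

static unsigned long pointer_bits(const void *ptr)
{
    union out_value v;

    v.p = ptr;
    return v.u;   /* reads a member other than the one last written */
}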
Signed-off-by: Martin Åberg <martin.aberg@gaisler.com>
Using the same implementation as the rest of Zephyr reduces code size.
Update options and expected results for the formatting test.
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
This commit adds a C99 stdio value formatter capability where
generated text is emitted through a callback. This allows generation
of arbitrarily long output without a buffer, functionality that is
core to printk, logging, and other system and application needs.
The formatter supports most C99 specifications, excluding:
* %Lf long double conversion
* wide character output
Kconfig options allow disabling features like floating-point
conversion if they are not necessary. By default most conversions are
enabled.
The original z_vprintk() implementation is adapted to meet the
interface requirements of cbvprintf, and made available as an opt-in
feature for space-constrained applications that do not need full
formatting support.
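A usage sketch (callback signature and header path assumed from this
series; the sink function is hypothetical):

#include <sys/cbprintf.h>

extern void console_putchar(char c);     /* hypothetical character sink */

static int console_out(int c, void *ctx)
{
    (void)ctx;
    console_putchar((char)c);
    return c;
}

void report(int temp)
{
    cbprintf(console_out, NULL, "temperature: %d C\n", temp);
}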
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
This reverts commit e812ee6c21.
This is the initial step towards replacing the core Zephyr formatting
infrastructure with a common functionally-complete solution.
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
Previously, the ring buffer had a capacity of the provided buffer size
minus 1. This trick was used to distinguish between the empty and full
states. It had one drawback: the ring buffer could not be used as a pool
of equal-sized buffers (using ring_buf_put_claim and ring_buf_get_claim).
Reworked the internals to use a non-wrapping head and tail. Since they
are non-wrapping, there is no issue with distinguishing between empty
and full. Since this approach would be vulnerable to wrapping at the
32-bit boundary, a mechanism was added that periodically reduces all
indexes to avoid 32-bit wrapping.
After this rework, the buffer has one more byte of capacity. A simple
test shows a slight performance improvement.
Updated the tests to reflect the increased capacity and added a test to
check that it is possible to continuously allocate 2 buffers of half the
ring buffer size.
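Conceptually (illustrative types and names, not the actual ring_buf
code):

#include <stdint.h>

struct rb_sketch {
    uint32_t head;   /* total bytes consumed; only ever grows */
    uint32_t tail;   /* total bytes produced; only ever grows */
    uint32_t size;   /* full capacity, no reserved byte       */
};

static inline uint32_t rb_used(const struct rb_sketch *rb)
{
    return rb->tail - rb->head;   /* unambiguous: 0 (empty) .. size (full) */
}

static inline uint32_t rb_space(const struct rb_sketch *rb)
{
    return rb->size - rb_used(rb);
}

static inline void rb_rebase(struct rb_sketch *rb)
{
    /* Periodically pull both indexes back by a multiple of the buffer
     * size so they never approach the 32-bit wrap point; offsets into
     * the data array (index % size) are unaffected. */
    uint32_t adj = (rb->head / rb->size) * rb->size;

    rb->head -= adj;
    rb->tail -= adj;
}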
Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
The compiler doesn't need help here.
For example, gcc creates this on Aarch64:
_ldiv5:
ldr x1, [x0]
mov x2, -3689348814741910324
movk x2, 0xcccd, lsl 0
add x1, x1, 2
umulh x1, x1, x2
lsr x1, x1, 2
str x1, [x0]
ret
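The C source behind that output is presumably nothing more than the
plain division (the +2 providing the rounding), e.g.:

#include <stdint.h>

static void _ldiv5(uint64_t *v)
{
    /* gcc turns this into the reciprocal multiply shown above; no
     * hand-written shift/add sequence is needed. */
    *v = (*v + 2) / 5;
}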
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The code that made aligned_alloc work with the 4-byte heap headers was
requesting a block of the correctly padded size, and correctly
aligning the output buffer within that memory, but it was using the
UNALIGNED chunk size for the buffer as the final size of the block
when splitting off the unused suffix. So the final chunk in the
buffer could be incorrectly returned to the heap and reused,
leading to overlap.
Compute the chunk size of the output buffer based on the
already-aligned output pointer instead.
Initial investigation and fix from Andy Ross <andrew.j.ross@intel.com>.
I reworked his fix, created a test case, and stole his commit log.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Replace all calls to the assert macro that comes from libc by calls to
__ASSERT_NO_MSG(). This is useful as the former might differ
depending on the libc used, while the latter can be customized to reduce
flash footprint.
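For example:

#include <sys/__assert.h>

static void consume(const char *buf)
{
    /* before: assert(buf != NULL);   (libc-dependent) */
    __ASSERT_NO_MSG(buf != NULL);     /* Zephyr-provided, footprint-tunable */
    /* ... */
}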
Signed-off-by: Xavier Chapron <xavier.chapron@stimio.fr>
Both operands of an operator in which the usual arithmetic conversions
are performed shall have the same essential type category.
The changes convert integer constants to unsigned integer constants.
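Typical shape of such a change (illustrative only):

#include <stdint.h>

static uint32_t low_byte(uint32_t flags)
{
    /* before: return flags & 0xFF;   (signed constant, mixed categories) */
    return flags & 0xFFU;             /* unsigned constant, same category  */
}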
Signed-off-by: Aastha Grover <aastha.grover@intel.com>
Multiple calls of z_free_fd() against an fd with a refcount of 0 cause
a descriptor table entry leak by decrementing the refcount below 0.
This patch prevents decrementing the refcount below zero.
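A sketch of the kind of guard involved (field and helper names assumed,
not the exact fdtable code), using a compare-and-swap loop so the count
can never go below zero:

#include <sys/atomic.h>

struct fd_entry_sketch {
    atomic_t refcount;
    /* ... */
};

static atomic_val_t fd_unref_sketch(struct fd_entry_sketch *entry)
{
    atomic_val_t old;

    do {
        old = atomic_get(&entry->refcount);
        if (old == 0) {
            return 0;     /* already released: do not underflow */
        }
    } while (!atomic_cas(&entry->refcount, old, old - 1));

    return old - 1;       /* remaining references */
}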
Signed-off-by: Grzegorz Kostka <grzegorz@mobility.cloud>
The vararg extraction for unmodified integers always used int, which
sign extends when assigned to the printk_val_t. Avoid the sign
extension for unsigned values.
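A sketch of the change (printk_val_t here stands in for the wide
internal value type):

#include <stdarg.h>
#include <stdbool.h>

typedef unsigned long long printk_val_t;    /* stand-in for the wide type */

static printk_val_t get_int_arg(va_list *ap, bool is_unsigned)
{
    /* Pulling everything as int sign-extends into the wide type; pull
     * unsigned conversions as unsigned int so they zero-extend. */
    return is_unsigned ? (printk_val_t)va_arg(*ap, unsigned int)
                       : (printk_val_t)va_arg(*ap, int);
}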
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
Character class functions from ctype.h may be implemented as macros
where the argument is used to index an array of class flags. Using a
char value as an index produces diagnostics in some toolchains.
Explicitly cast the parameter to the type required by the API.
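For example:

#include <ctype.h>

static int count_digits(const char *s)
{
    int n = 0;

    while (*s != '\0') {
        /* isdigit() takes an int; cast explicitly rather than letting a
         * plain char index the class-flag table. */
        if (isdigit((int)*s) != 0) {
            n++;
        }
        s++;
    }
    return n;
}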
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
shell_fprintf requires that formatted output be emitted with a
putchar()-like output function. Newlib does not provide such a
capability. Zephyr provides two solutions: z_prf(), which is part of
the minimal libc and handles floating point formatting, and z_vprintk(),
which is core and does not support floating point.
Move z_prf() out of minimal libc into the core lib area, and use it
unconditionally in the shell.
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
The new fd entry should be reserved by incrementing its reference count
in z_reserve_fd() instead of z_finalize_fd() in order to avoid having
the same one being returned in a concurrent call. If for some reason
the fd is not finalized after z_reserve_fd() is called, it can be
freed via z_free_fd(), which would decrement the reference count.
Fixes #27721
Signed-off-by: Vincent Wan <vwan@ti.com>
-Wimplicit-fallthrough=2 requires a fallthrough comment or a compiler
attribute to tell gcc that the fall-through happens intentionally.
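For example (hypothetical helpers and case values), a matching comment
satisfies the warning at this level, as does the GCC fallthrough
attribute:

extern void prepare_a(void);        /* hypothetical */
extern void handle_common(void);    /* hypothetical */

void dispatch(int type)
{
    switch (type) {
    case 1:
        prepare_a();
        /* Fall through */
    case 2:
        handle_common();
        break;
    default:
        break;
    }
}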
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Do not route close() calls via ioctl() as that is error-prone
and quite pointless. Instead, create a callback for close() in
fdtable and use it directly.
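Conceptually, the vtable gains a dedicated entry (sketch only, not the
exact fd_op_vtable layout):

#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>

struct fd_vtable_sketch {
    ssize_t (*read)(void *obj, void *buf, size_t sz);
    ssize_t (*write)(void *obj, const void *buf, size_t sz);
    int (*close)(void *obj);    /* called directly, no ioctl() detour */
    int (*ioctl)(void *obj, unsigned int request, va_list args);
};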
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
The on-off manager infrastructure is designed for robust asynchronous
transitions between binary states where multiple clients may be
initiating a transition from any context. The actual transition is
performed using a manager that tracks the current state and pending
operations. Requests are initiated by passing a reference to an
onoff_client object that holds client state including the notification
mechanism.
This API may be used in subsystems where the transitions for a
particular driver are always synchronous and isr-ok, e.g. setting a
SoC-controlled GPIO. In this situation the full on-off manager
infrastructure is wasteful. All we need is a record of the service
state: off, active count, or error.
Add a data structure and an API that can be used to replace the onoff
manager functionality in a situation where all transitions are isr-ok
and synchronous while retaining compatible behavior from the client
perspective.
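A sketch of the replacement state and request path (conceptual, not the
exact API that was merged):

#include <kernel.h>
#include <stdint.h>

struct onoff_sync_sketch {
    struct k_spinlock lock;
    int32_t count;    /* 0 = off, >0 = number of requests, <0 = -errno */
};

static int sync_request_sketch(struct onoff_sync_sketch *srv)
{
    k_spinlock_key_t key = k_spin_lock(&srv->lock);
    int rv = srv->count;

    if (rv >= 0) {
        rv += 1;          /* the synchronous, isr-ok turn-on happens here */
        srv->count = rv;
    }
    k_spin_unlock(&srv->lock, key);
    return rv;            /* negative value reports the error state */
}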
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
Use proper refcounting instead of a magic value in the obj field
when checking whether the fd is still in use. This will make
sure that if an fd is shared between two threads, we do not
release it too soon.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
This set of functions seems to be there just for historical
reasons, stemming from Kbuild. They are non-obvious and prone to errors,
so remove them in favor of the `_ifdef()` ones with an explicit
`CONFIG_` condition.
Script used:
git grep -l _if_kconfig | xargs sed -E -i
"s/_if_kconfig\(\s*(\w*)/_ifdef(CONFIG_\U\1\E \1/g"
Signed-off-by: Carles Cufi <carles.cufi@nordicsemi.no>
Given that socket offloading is now implemented under the fd's vtable, we can
directly use the default fcntl implementation.
Signed-off-by: Vincent Wan <vwan@ti.com>
MISRA-C Rule 5.3 states that identifiers in inner scope should
not hide identifiers in outer scope.
In the function sys_heap_alloc(), the variable "chunksz"
collides with the function named chunksz(), so rename that variable.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Just as NULL pointers should not be dereferenced, they should
not be called either.
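Illustrative guard (generic sketch):

#include <errno.h>

typedef int (*close_fn)(void *obj);

static int call_close(close_fn fn, void *obj)
{
    if (fn == NULL) {
        return -ENOTSUP;    /* never call through a NULL pointer */
    }
    return fn(obj);
}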
Fixes #26723
Signed-off-by: Pete Skeggs <peter.skeggs@nordicsemi.no>
After commit 8a6b02b5bf ("lib/os/heap: some code simplification in
sys_heap_aligned_alloc()") it is no longer required to have a "big"
heap for aligned allocations to work on 32-bit targets. While the
natural alignment for returned memory has an offset of 4 within a chunk
unit due to the smaller header size, returning to a chunkid from a
memory pointer with an offset of 8 will fall back onto the proper chunk
number once the 4 is subtracted and then divided by 8.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The code is doing a split in split_alloc(), adding the leftover to the
free list, then splitting the suffix away in sys_heap_aligned_alloc(),
removing the former leftover from the free list, combining it with the
suffix and finally adding the combined chunk back to the free list.
Instead, let's have each allocator do its own splitting only once by
moving the split_alloc() processing upstream rather than downstream.
This also allows for the "used" flag to be set only once at the end
rather than being overwritten along the way.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Instead of limiting the excess split-off to sufficiently large chunks
in split_alloc(), let's allow normal allocations to create "solo free
headers" just like with aligned allocations. There is no point leaving
them in the allocated chunk if the user didn't ask for it. Doing so
makes them eligible for merging at the next opportunity and potentially
reusable sooner.
Also make the validation code aware of them.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
One fundamental validation criteria is to never have consecutive free
chunks. If that ever happens we failed to merge them. That means a free
chunk must always be surrounded by used chunks.
It is a pain to extend valid_chunk() with new rules as it is.
So a VALIDATE() macro is introduced to make things easier to work with.
It also allows for isolating each test, possibly making VALIDATE() into
__ASSERT() to determine exactly which test is tripping when debugging.
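A sketch of such a macro (close to, though not necessarily identical
to, what lands in heap-validate.c):

/* Each validation rule becomes a single line; swap the body for
 * __ASSERT() when you need to see exactly which rule trips. */
#define VALIDATE(cond) do { if (!(cond)) { return false; } } while (false)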
Finally, because of that new validation rule, sys_heap_validate() must
be modified so as not to use valid_chunk() while it is flipping all the
"used" flags. So let's run valid_chunk() up front before altering
chunk headers.
Now sys_heap_validate() has become justifiably more expensive and a few
emulated targets are about to bust the tests/lib/heap test timeout. So
bump the timeout as well.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
This makes the code cleaner wrt bucket_idx() usage on chunks for which
solo_free_header() is true. In such case the bucket_idx() computation
is useless, and potentially undefined anyway.
In the same vein, move the clearing of the used flag out of
free_chunks() as only one of its callers actually needs that.
Also make free_chunks() singular as there is only one chunk (potentially
spanning multiple chunk units) to free.
Also some cosmetic changes for better code uniformity.
No functional changes.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Currently printk isn't synchronized except at the byte output level,
leading to interleaving of messages on SMP systems that try to log
simultaneously. This is actually fairly amusing, and occasionally even
helpful to validate inter-CPU contention down to the "few cycles"
level.
Still, when you're printing data you need to read, you need to be able
to read it. Put a spinlock around each buffered line. This has to
happen in a few places, as there are three different code paths taken
for !USERSPACE, syscall, and user mode.
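A minimal sketch of the serialization (names illustrative):

#include <kernel.h>
#include <stdarg.h>

extern void format_into_line_buffer(const char *fmt, va_list ap);  /* hypothetical */
extern void flush_line_buffer(void);                               /* hypothetical */

static struct k_spinlock lock;

void buffered_vprintk(const char *fmt, va_list ap)
{
    k_spinlock_key_t key = k_spin_lock(&lock);

    format_into_line_buffer(fmt, ap);   /* whole line emitted under the lock */
    flush_line_buffer();
    k_spin_unlock(&lock, key);
}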
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The width for %p on 32-bit targets should be 8 regardless of
CONFIG_PRINTK64. Adjust the test accordingly.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Some checks in sys_heap_init() depend on the externally provided size
parameter. If the check fails, this would be a bug outside of the heap
code and therefore should be flagged regardless of the value of
CONFIG_SYS_HEAP_VALIDATE.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Add support for 64-bit conversions in a uniformly expressible way by
printing values backwards into a buffer on the stack first. This
allows all operations to work on the low bits of the value and so the
code doesn't need to care (beyond the size of that buffer) about the
word size. This trick also doesn't care about the specifics of the
base value, so in the process this unifies the decimal and hex printk
conversion code to a single function.
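The core of the trick looks roughly like this (sketch; sign handling,
field padding and buffer sizing omitted):

#include <stdint.h>

/* Produce the digits of v in the given base, least significant first,
 * ending just before *end; return a pointer to the first digit.  The
 * same loop serves 32- and 64-bit values and any base up to 16. */
static char *to_digits(char *end, uint64_t v, unsigned int base)
{
    static const char digits[] = "0123456789abcdef";
    char *p = end;

    *--p = '\0';
    do {
        *--p = digits[v % base];
        v /= base;
    } while (v != 0U);

    return p;
}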
This comes at a mild cost in CPU cycles to the decimal converter and
somewhat higher cost to hex (because it's now doing a full div/mod
operation instead of shifting and masking). And stack usage has grown
by a few words to hold the temporary. But the benefits in code size
are substantial (e.g. ~250 bytes of .text on arm32).
Note that this also contains a change to tests/kernel/common to
address what appears to have been a bug in the original converters.
The printk test uses a format string that looks like "%-4x%-2p" and
feeds it the literal arguments "0xABCDEF" and "(char *)42".
Now... clearly both those results are going to overflow the 4- and
2-character field widths, so there shouldn't be any whitespace between
these fields. But the test was written to expect two spaces, inexplicably
(yes, I checked: POSIX-compatible printf implementations don't have
those spaces either).
The new code is definitely doing the right thing, so fix the test
instead.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add support for a C11-style aligned_alloc() in the heap
implementation. This is properly optimized, in the sense that unused
prefix/suffix data around the chosen allocation is returned to the
heap and made available for general allocation.
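Usage sketch (assuming the sys_heap variant takes the heap, the
alignment and the byte count, in that order):

#include <sys/sys_heap.h>

static void example(struct sys_heap *heap)
{
    void *buf = sys_heap_aligned_alloc(heap, 64, 200);   /* 64-byte aligned */

    if (buf != NULL) {
        /* prefix/suffix chunks trimmed to achieve the alignment are
         * already back on the free lists, not held by this block */
        sys_heap_free(heap, buf);
    }
}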
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Miscellaneous refactoring and simplification. No behavioral changes:
Make split_alloc() take and return chunk IDs and not memory pointers,
leaving the conversion between memory/chunks the job of the higher
level sys_heap_alloc() API. This cleans up the internals for code
that wants to do allocation but has its own ideas about what to do
with the resulting chunks.
Add split_chunks() and merge_chunks() utilities to own the linear/size
pointers and have split_alloc() and free_chunks() use them instead of
doing the list management directly.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This struct is taking up most of the heap's constant footprint overhead.
We can easily get rid of the list_size member as it is mostly used to
determine if the list is empty, and that can be determined through
other means.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Make the LEFT_SIZE field first and SIZE_AND_USED field last (for an
allocated chunk) so they sit right next to the allocated memory. The
current chunk's SIZE_AND_USED field points to the next (right) chunk,
and from there the LEFT_SIZE field should point back to the current
chunk. Many trivial memory overflows should trip that test.
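In other words, the check this layout enables looks roughly like the
following (hypothetical accessor names):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t chunkid_t;
struct heap;                                                   /* opaque */
extern uint32_t  chunk_size_of(struct heap *h, chunkid_t c);   /* hypothetical */
extern uint32_t  left_size_of(struct heap *h, chunkid_t c);    /* hypothetical */
extern chunkid_t right_chunk_of(struct heap *h, chunkid_t c);  /* hypothetical */

static bool neighbor_consistent(struct heap *h, chunkid_t c)
{
    /* A write past the end of c's memory lands on the next chunk's
     * LEFT_SIZE field, which this comparison then catches. */
    return left_size_of(h, right_chunk_of(h, c)) == chunk_size_of(h, c);
}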
One way to make this test more robust could involve xor'ing the values
within respective accessor pairs. But at least the fact that the size
value is shifted by one bit already prevents fooling the test with a
same-byte corruption.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>