zephyr/scripts/process_gperf.py
Andrew Boie 945af95f42 kernel: introduce object validation mechanism
All system calls made from userspace which involve pointers to kernel
objects (including device drivers) will need to have those pointers
validated; userspace should never be able to crash the kernel by passing
it garbage.

The actual validation with _k_object_validate() will be in the system
call receiver code, which doesn't exist yet.

- CONFIG_USERSPACE introduced. We are somewhat far away from having an
  end-to-end implementation, but at least need a Kconfig symbol to
  guard the incoming code with. Formal documentation doesn't exist yet
  either, but will appear later down the road once the implementation is
  mostly finalized.

- In the memory region for RAM, the data section has been moved last,
  past bss and noinit. This ensures that inserting generated tables
  with addresses of kernel objects does not change the addresses of
  those objects (which would make the table invalid)

- The DWARF debug information in the generated ELF binary is parsed to
  fetch the locations of all kernel objects and pass this to gperf to
  create a perfect hash table of their memory addresses.

- The generated gperf code doesn't know that we are exclusively working
  with memory addresses and uses memory inefficiently. A post-processing
  script process_gperf.py adjusts the generated code before it is
  compiled to work with pointer values directly and not strings
  containing them.

- _k_object_init() calls inserted into the init functions for the set of
  kernel object types we are going to support so far

Issue: ZEP-2187
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-09-07 16:33:33 -07:00
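
The post-processing step described in the commit message turns each gperf key, a C string literal holding the raw bytes of an address, into a plain pointer constant. The following is a minimal sketch of that conversion, not the script's own code. It assumes 4-byte little-endian addresses and three-digit octal escapes in the gperf output; the helper name `key_to_pointer` and the sample key are hypothetical.

```python
import struct


def key_to_pointer(c_literal):
    """Decode a quoted C string literal holding 4 raw address bytes.

    Assumes three-digit octal escapes (for example \\020) and simple
    backslash escapes (for example \\"), as a C string would need.
    """
    body = c_literal[1:-1]  # strip the surrounding double quotes
    raw = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and body[i + 1].isdigit():
            # Octal escape sequence: backslash plus three digits
            raw.append(int(body[i + 1:i + 4], 8))
            i += 4
        elif body[i] == "\\":
            # Character that had to be escaped by C string rules
            raw.append(ord(body[i + 1]))
            i += 2
        else:
            # Ordinary printable character, stored as its byte value
            raw.append(ord(body[i]))
            i += 1
    # Assumption: the first byte of the string is the least significant
    # byte of the address (little-endian), and there are exactly 4 bytes.
    (value,) = struct.unpack("<I", bytes(raw))
    return "(char *)0x%08x" % value


# Hypothetical key: bytes 0x00 0x10 0x00 0x20 (the last one is a space)
sample = r'"\000\020\000 "'
print(key_to_pointer(sample))  # prints (char *)0x20001000
```

Comparing such constants with `==` replaces the `memcmp()` on 4-byte strings, which is why the length table in the generated code becomes unnecessary.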

151 lines
4.3 KiB
Python
Executable File

#!/usr/bin/env python3
#
# Copyright (c) 2017 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0

"""
gperf C file post-processor

We use gperf to build up a perfect hashtable of pointer values. The way gperf
does this is to create a table 'wordlist' indexed by a string representation
of a pointer address, and then to memcmp() against a string passed in for
comparison.

We are exclusively working with 4-byte pointer values. This script adjusts
the generated code so that we work with pointers directly and not strings.
This saves a considerable amount of space.
"""

import sys
import argparse
import os
import re
from distutils.version import LooseVersion

# --- debug stuff ---

def debug(text):
    if not args.verbose:
        return
    sys.stdout.write(os.path.basename(sys.argv[0]) + ": " + text + "\n")

def error(text):
    sys.stderr.write(os.path.basename(sys.argv[0]) + " ERROR: " + text + "\n")
    sys.exit(1)

def warn(text):
    sys.stdout.write(os.path.basename(sys.argv[0]) + " WARNING: " + text + "\n")

def reformat_str(match_obj):
    addr_str = match_obj.group(0)

    # Nip quotes
    addr_str = addr_str[1:-1]
    addr_vals = [0, 0, 0, 0]
    ctr = 3
    i = 0

    while (True):
        if i >= len(addr_str):
            break

        if addr_str[i] == "\\":
            if addr_str[i+1].isdigit():
                # Octal escape sequence
                val_str = addr_str[i+1:i+4]
                addr_vals[ctr] = int(val_str, 8)
                i += 4
            else:
                # Char value that had to be escaped by C string rules
                addr_vals[ctr] = ord(addr_str[i+1])
                i += 2
        else:
            addr_vals[ctr] = ord(addr_str[i])
            i += 1

        ctr -= 1

    return "(char *)0x%02x%02x%02x%02x" % tuple(addr_vals)

def process_line(line, fp):
    if line.startswith("#"):
        fp.write(line)
        return

    # Set the lookup function to static inline so it gets rolled into
    # _k_object_find(), nothing else will use it
    if re.search("struct _k_object [*]$", line):
        fp.write("static inline " + line)
        return

    m = re.search("gperf version (.*) [*][/]$", line)
    if m:
        v = LooseVersion(m.groups()[0])
        v_lo = LooseVersion("3.0")
        v_hi = LooseVersion("3.1")
        if (v < v_lo or v > v_hi):
            warn("gperf %s is not tested, versions %s through %s supported" %
                 (v, v_lo, v_hi))

    # Replace length lookups with constant len of 4 since we're always
    # looking at pointers. The brackets are backslash-escaped rather than
    # written as [[] so that newer Python versions do not warn about a
    # possible nested set.
    line = re.sub(r'lengthtable\[key\]', r'4', line)

    # Empty wordlist entries to have NULLs instead of ""
    line = re.sub(r'[{]["]["][}]', r'{}', line)

    # Suppress a compiler warning since this table is no longer necessary
    line = re.sub(r'static unsigned char lengthtable',
                  r'static unsigned char __unused lengthtable', line)

    # Drop all use of the register keyword and let the compiler figure that
    # out; we have to do this since we change the code to take the address
    # of some parameters
    line = re.sub(r'register', r'', line)

    # Hashing the address of the string
    line = re.sub(r"hash [(]str, len[)]",
                  r"hash((const char *)&str, len)", line)

    # Just compare pointers directly instead of using memcmp
    if re.search("if [(][*]str", line):
        fp.write(" if (str == s)\n")
        return

    # Take the strings with the binary information for the pointer values,
    # and just turn them into pointers
    line = re.sub(r'["].*["]', reformat_str, line)

    fp.write(line)

def parse_args():
    global args

    parser = argparse.ArgumentParser(description = __doc__,
            formatter_class = argparse.RawDescriptionHelpFormatter)

    parser.add_argument("-i", "--input", required=True,
            help="Input C file from gperf")
    parser.add_argument("-o", "--output", required=True,
            help="Output C file with processing done")
    parser.add_argument("-v", "--verbose", action="store_true",
            help="Print extra debugging information")
    args = parser.parse_args()

def main():
    parse_args()

    with open(args.input, "r") as in_fp, open(args.output, "w") as out_fp:
        for line in in_fp.readlines():
            process_line(line, out_fp)

if __name__ == "__main__":
    main()
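
The line-level substitutions in process_line() can be tried in isolation. The following is a minimal sketch that applies three of them to made-up lines; the helper name `rewrite` is hypothetical, and the sample lines only approximate what gperf emits, so real output may differ in spacing and detail.

```python
import re


def rewrite(line):
    """Apply three of the substitutions described above to one line."""
    # Every key is a 4-byte pointer, so the length lookup is a constant
    line = re.sub(r"lengthtable\[key\]", "4", line)
    # Empty wordlist slots become empty initializers instead of ""
    line = re.sub(r'[{]["]["][}]', "{}", line)
    # Hash the bytes of the pointer itself, not a string it points to
    line = re.sub(r"hash [(]str, len[)]",
                  "hash((const char *)&str, len)", line)
    return line


# Hypothetical lines resembling gperf output
samples = [
    "if (len == lengthtable[key])",
    '{""}, {""},',
    "unsigned int key = hash (str, len);",
]

for sample in samples:
    print(rewrite(sample))
```

Each substitution is a plain text match, so it relies on gperf keeping its output layout stable; this is presumably why the script warns when the gperf version falls outside the tested 3.0 through 3.1 range.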