Background
I was writing code that uses functions from ctype.h to identify things in strings. I accidentally passed the string (char*) to functions that take an int, causing the program to segfault. It was easy enough to see that I had forgotten to dereference the string pointer, but GCC gave me no warnings, even when compiling with the following arguments:

gcc -o main main.c -Wall -Wextra -Werror -pedantic -pedantic-errors -std=c99 -Wconversion

I am using Debian GNU/Linux bookworm 12.5 x86_64 and gcc (Debian 12.2.0-14) 12.2.0, both fully up to date. Here’s an example of the problem:

/* main.c */
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    char msg[] = "hello";
    int res = isspace(msg); // char* gets cast to int without warning
                            // It should be `isspace(*msg)`
                            // This also segfaults
    printf("%i\n", res);
    return 0;
}

Questions

  1. What warnings can I turn on to get compile time errors for these pointer-to-integer conversions?
  2. Why does this even segfault in the first place?

Answers


  1. You’re passing in a value that is outside the range of values the function expects. Doing so triggers undefined behavior, as per section 7.4p1 of the C standard regarding functions defined in ctype.h:

    The header <ctype.h> declares several functions useful for classifying
    and mapping characters. In all cases the argument is an int, the
    value of which shall be representable as an unsigned char or shall
    equal the value of the macro EOF. If the argument has any other value,
    the behavior is undefined.

    And since this is undefined behavior, crashing is one possible outcome.

    As for why there’s no warning generated by the compiler, we need to look at the preprocessor output. The call to isspace gets converted to the following after the preprocessor:

    int res = ((*__ctype_b_loc ())[(int) ((msg))] & (unsigned short int) _ISspace);
    

    From this, we can see that isspace is implemented as a macro that uses the given argument as an index into a lookup table, and that the argument is explicitly cast to int. This explicit cast explains why there’s no warning.

    The above also explains the crash, since a pointer value will likely be far out of the bounds of this lookup table and therefore attempt to access memory it doesn’t have access to.

  2. Likely, with your compiler, isspace() is implemented as a macro that casts whatever argument it gets to char or int.

    When the compiler sees an explicit cast, it takes the programmer at their word. Macro arguments are not type-checked at all: a macro parameter has no declared type, so there is nothing for the compiler to check the argument against.
