ashleydixon
#1
Report Thread starter 2 years ago
Not sure if this is the right forum to post this in, since it's mainly chat and not help, but it seemed a bit too specialist for the general chat forum. Anyway, I was writing some C code the other day and came across this really trivial bug. I noticed it as soon as it caused an error, yet nobody I've shown it to (three adults with degrees in computer science, my A-level computer science teacher, and my computer science class) seems to see where the bug is.

Just interested whether it's a really obscure bug and I was lucky to catch it, or whether everyone else simply missed it. Wanted to see what the TSR community had to say.

It's an implementation of an absolute-value function.
Code:
int naive_abs ( int n ) {
    return ( n < 0 ) ? -n : n;
}
winterscoming
#2
Report 2 years ago
In my opinion, it's arguable whether that's really a bug in the function itself, because integer representation is a platform-specific issue and integer range is a documented limitation of the C programming language (signed integer overflow is undefined behaviour in C, and the language itself is agnostic to the underlying platform/architecture). The reality is that any code using large integers needs to be aware of what happens around INT_MIN and INT_MAX.
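
For example (a contrived sketch of my own, assuming a typical 32-bit two's-complement int), the problem input is INT_MIN, whose magnitude simply doesn't fit in an int:
Code:
#include <limits.h>
#include <stdio.h>

int naive_abs ( int n ) {
    /* If n == INT_MIN (-2147483648 here), then -n would be 2147483648,
       which is one more than INT_MAX, so evaluating -n is undefined
       behaviour. In practice it usually wraps back to INT_MIN, and the
       "absolute value" comes out negative. */
    return ( n < 0 ) ? -n : n;
}

int main(void) {
    printf("naive_abs(INT_MIN) = %d\n", naive_abs(INT_MIN));
    return 0;
}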

It's also worth pointing out that the built-in abs(int) function from <stdlib.h> behaves in exactly the same way, and the limitation is documented: https://en.cppreference.com/w/c/numeric/math/abs

Of course, if you wanted to "fix" it you could just change the result type to unsigned int, but that has its own equally subtle implications: the code which uses it now has to deal with a mix of different integer types. Conversions between signed and unsigned are susceptible to the same kind of problem elsewhere in the code, and the programmer using the function still needs to be aware of that when working with large-magnitude integers.

So one possible alternative, while still naive/problematic, could be:
Code:
unsigned naive_abs ( int n ) {
    return ( n < 0 ) ? 0u - n : 0u + n;
}
In my opinion this doesn't really solve anything in a useful way, though; it just creates a different kind of potential problem elsewhere by forcing the code which uses it to mix signed and unsigned together. (It doesn't change the fact that sizeof(int) == sizeof(unsigned int), nor that two's-complement issues are platform-specific.)
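
As a sketch of the kind of trap I mean (my own contrived example, assuming 32-bit ints), the usual arithmetic conversions will silently turn a signed operand into unsigned whenever it meets the unsigned result:
Code:
#include <stdio.h>

unsigned naive_abs ( int n ) {
    /* n is converted to unsigned first, so 0u - n is well-defined
       modular arithmetic even for INT_MIN (no signed overflow). */
    return ( n < 0 ) ? 0u - n : 0u + n;
}

int main(void) {
    int limit = -5; /* imagine an edge case produced a negative limit */

    /* 'limit' is converted to unsigned for this comparison, so -5
       becomes 4294967291 and the test is true for almost any input. */
    if ( naive_abs(3) < limit )
        printf("3 is apparently less than -5\n");
    return 0;
}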


From a defensive and platform-agnostic point of view, I think a 'correct' solution which doesn't impose a different data type would need some way of detecting the problem case on a two's-complement architecture and returning an error. So I would consider something like this instead, since it forces the code which uses it to think about how to handle the int range boundary on two's-complement systems:
Code:
#include <stdio.h>
#include <limits.h>

int safe_abs(int n, int* out) {
  if (out == NULL) return -1;

#if (-1 == ~0) /* detect two's-complement architecture */
  if (n == INT_MIN) return -1; /* -INT_MIN is not representable */
#endif

  *out = (n < 0) ? -n : n;
  return 0;
}

int main(void) {
  int result;
  if (safe_abs(INT_MIN, &result) == 0)
    printf("result = %d\n", result);
  else
    printf("safe_abs error\n");
  return 0;
}
The downside is that it adds complexity to the 99% of cases where signed integer overflow is never going to be an issue.
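
As an aside, the undefined-behaviour sanitizer in GCC/Clang will also catch the original overflow at run time; e.g. with the naive version saved in a (hypothetical) file demo.c:
Code:
cc -fsanitize=undefined demo.c && ./a.out
The run then fails with a diagnostic along the lines of "runtime error: negation of -2147483648 cannot be represented in type 'int'".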
Last edited by winterscoming; 2 years ago
ashleydixon
#3
Report Thread starter 2 years ago
(Original post by winterscoming)
Yes, sorry, maybe the word bug was a bit too strong; undefined behaviour or "flawed function" sounds nicer. I was using it as part of a larger program (10,000+ lines) where it caused more than a few bugs! However, there's no need to add the complexity of an output parameter alongside the standard return value: since your function returns a signed integer anyway, it could just return -1 on error and the non-negative absolute value on success.

Code:
#include <stdio.h>
#include <limits.h>

int safe_abs ( int n ) {
#if (-1 == ~0) /* detect twos complement architecture */
    if (n == INT_MIN) return -1;
#endif

    return ( n < 0 ) ? -n : n;
}

int main ( ) {
    int value = -1;

    if ( ( value = safe_abs ( INT_MIN ) ) == -1 ) {
        fprintf ( stderr, "error: safe_abs error, minimum integer was passed\n" );
    } else {
        printf ( "value = %d\n", value );
    }

    return 0;
}
Last edited by ashleydixon; 2 years ago
winterscoming
#4
Report 2 years ago
(Original post by ashleydixon)
The point of splitting the error code away from the result is to make it clear to the programmer that the function can fail, encouraging them to handle it (it's more explicit that way), and also to prevent the output value from ever being negative. The problem with combining the result and the error code like this is that it's just as opaque to programmers as the original version of the code: if the problem to be solved is that the INT_MIN case silently produces a bogus value and causes an obscure bug, then silently returning -1 isn't going to be any less obscure.
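
To illustrate (a contrived sketch of my own, with hypothetical names for the two styles), nothing stops a caller from feeding the sentinel -1 straight into later arithmetic, whereas the out-parameter version makes the error path a separate, visible branch:
Code:
#include <stdio.h>
#include <limits.h>

int safe_abs_sentinel(int n) {           /* your version */
#if (-1 == ~0)
    if (n == INT_MIN) return -1;
#endif
    return (n < 0) ? -n : n;
}

int safe_abs_checked(int n, int* out) {  /* my earlier version */
    if (out == NULL) return -1;
#if (-1 == ~0)
    if (n == INT_MIN) return -1;
#endif
    *out = (n < 0) ? -n : n;
    return 0;
}

int main(void) {
    long total = 0;

    /* Sentinel style: compiles cleanly, and the error code -1 is
       silently folded into the sum as if it were a real magnitude. */
    total += safe_abs_sentinel(INT_MIN);

    /* Out-parameter style: the caller has to write the error branch
       before a result exists at all. */
    int magnitude;
    if (safe_abs_checked(INT_MIN, &magnitude) != 0)
        fprintf(stderr, "cannot take abs of INT_MIN\n");
    else
        total += magnitude;

    printf("total = %ld\n", total);
    return 0;
}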
Last edited by winterscoming; 2 years ago
ashleydixon
#5
Report Thread starter 2 years ago
(Original post by winterscoming)
Yes, you're completely right -- I probably shouldn't post on TSR ten minutes after waking up lol
Anyway, it's been nice talking to you about this