worst bugs ever (or your most hated)

sparkyb
Posts: 1091
Joined: Thu Sep 06, 2007 7:30 pm UTC
Location: Camberville proper!

Re: worst bugs ever (or your most hated)

Postby sparkyb » Fri Dec 12, 2008 5:40 pm UTC

Some of this stuff might only belong in the "It Doesn't Work" thread. If these are really your most annoying bugs ever then you've been pretty lucky.

My most hated bugs are the ones in a category I call "mystery bugs", where even the debugging steps I try don't work as they should and everything seems to defy logic. There were 2 or 3 of these situations on 1 particular project.

The first was actually a hardware/drivers kind of thing. We were doing a large installation with 5 identical networked setups. The 5 identical machines were in a rack in the control room, and the VGA was run, via some kind of repeaters, to the projectors out front. The interaction was camera based, so we had a pair of industrial USB machine vision cameras per screen, run back to the control room over fiber optic repeaters. These were high-end USB 2.0 cameras that took advantage of the full bandwidth. We'd had problems before with them not working right if we used an inferior USB cable, so we were also sending each camera over a separate set of fiber optic repeaters (2 sets per machine). These fiber optic repeaters were new to us, but seemed to work fine in the lab (with a shorter piece of fiber).

When we did the install, 4 of the 5 setups worked great. On one of them, the camera had an erratic and really slow frame rate. I spent 3 days trying to fix it. I changed out most of the components: the machine, the different sides of the repeaters, the cameras, the USB cables... Nothing. When I tried connecting a known good setup from another machine to the machine with the problem, or the problem setup to another machine that had worked fine, neither would work, although putting things back the way they were fixed the machine that had been working. I was totally baffled. At one point, out of desperation, I tried a procedure: uninstall the camera drivers, reboot, reinstall the camera drivers, reboot again. This actually seemed to make it work for a little while, but then it would randomly break again after a few minutes.

Finally we took a step back, went across the street, and discussed what we could do. My boss seemed to think that it had to be just one of the components, and that if we swapped them out one at a time we could find and fix it. I went on a tirade about how that's what I had been doing, so it must be 2 components, and if there are roughly 15 suspects that makes the chance of finding it 1/15^2 = 1/225. Well, after we got back, it was miraculously still working. We'd been gone probably 30 minutes since I had performed the procedure that usually only fixed it for a few minutes. However, only a few minutes after we returned, it broke again. I asked myself, "OK, what did I just do?" It turned out, bizarrely, that the camera worked fine when my USB flash drive was plugged in, and would break when it wasn't. The only reason the uninstall/reinstall had been temporarily fixing it was that I kept the driver installer on the USB key. We found out that it was actually OK if any other USB 2.0 device was plugged in, so we bought a cheap, small USB 2.0 hub. We were never able to reproduce this problem in the lab, and neither the camera manufacturer nor the fiber optic repeater manufacturer could give us any explanation for why this would happen. That was definitely my worst debugging experience, especially since the hardware setup wasn't even my job on that project.

Later on the same project, while testing the software, I encountered another mystery bug (which didn't end up being quite as mysterious once I finally figured it out). We had one game that would occasionally crash with no warning, error message, exception, error dialog, or even Windows crash notification; it would just disappear. Our games were written in Python, but based heavily on a C++ platform (both the game engine and many of our libraries, including the computer vision). At first we thought we could rule out the problem being in the C++, because most of that stuff was shared with other games that had no problems, plus we report pretty much all errors in there to an error log before quitting, and any null pointer dereferences, etc. would generate a Windows crash notification.

The tricky part about debugging this was that we couldn't tell what was causing it, so we couldn't reproduce it reliably. The game was only 5 minutes long; sometimes it would crash after a minute and other times we could run it 10 times in a row with no crash. The game was controlled by a camera, so I had to have someone out front waving their arms pretending to play. I'd put in a few debugging print statements to try to narrow down where the crash was, I'd run it, and then he'd have to wave his arms for as long as it took to crash it. Each time we'd sort of think "Oh, I think it crashed because I made this kind of motion", but we were really just guessing and acting like fools. I had to be careful where I put my debugging print statements, because inside certain loops they would slow the game down too much to test effectively.

It took me forever (2 days?) to finally narrow it down to some 3rd party C code being used by one of the C++ computer vision library modules written by one of my coworkers who was out of the country. The boundary condition on a loop iterating over an image was missing a term or two, so under the right circumstances it would read off the end of the array. Indeed, it was a Windows protection violation; I had just forgotten that I'd turned off all Windows crash notifications, which is why I wasn't seeing the dialogs I would have expected from that kind of bug. Oops.

There was yet a third mystery bug situation on this same project (in a different game). Like the last one, it was an intermittent crash that we didn't know how to reproduce. I don't remember as many of the details of this one. I remember there was some problem with even my debugging print statements (or maybe even the if statements that chose when to print and when not to). It was printing the wrong thing, or not printing at all, or something like that, and when I moved it into a different function it worked, or something. I don't remember exactly; I just remember having to do things I shouldn't have had to do. In the end it was a Vector class math issue where somehow one of the terms was becoming NaN or INF or something like that. It was because we were working in 2D with a 3D vector library and rounding errors were somehow accumulating on that untouched 3rd axis. Zeroing it out every frame fixed it, but finding it...
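For anyone curious how an axis nobody touches can end up as INF or NaN, here is a minimal sketch. It assumes (this is a guess, not the actual engine code) that some per-frame operation scales all three components while the game logic only ever clamps and inspects x and y. The `advance` function, the 1.5 factor, and the starting residue of 1e-9 are all hypothetical.

```python
import math

def advance(v, scale=1.5):
    # Hypothetical per-frame update: x and y are clamped by game logic,
    # but the unused z component is scaled and never inspected.
    x, y, z = v
    return (max(-100.0, min(100.0, x * scale)),
            max(-100.0, min(100.0, y * scale)),
            z * scale)

def simulate(frames, zero_z_each_frame):
    v = (1.0, 1.0, 1e-9)  # z starts as a tiny leftover, not exactly zero
    for _ in range(frames):
        v = advance(v)
        if zero_z_each_frame:
            v = (v[0], v[1], 0.0)  # the "zero it out every frame" fix
    return v

bad = simulate(2000, zero_z_each_frame=False)
good = simulate(2000, zero_z_each_frame=True)
print(math.isinf(bad[2]))            # z has overflowed to infinity
print(math.isnan(bad[2] - bad[2]))   # inf - inf is NaN
print(good)
```

Once one component is INF, any length or normalization computed from the full 3D vector is poisoned too, which would fit the "works for a while, then falls over" behaviour.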

'; DROP DATABASE;--
Posts: 3284
Joined: Thu Nov 22, 2007 9:38 am UTC
Location: Midwest Alberta, where it's STILL snowy

Postby '; DROP DATABASE;-- » Thu Jan 01, 2009 6:33 am UTC

Read an image upside-down:

Code: Select all

for(u32 y=0; y<Height; y++)
{
   for(u32 x=0; x<Width; x++)
   {
      fread(Data, 1, 3, File);
      //BMP is stored upside-down so use (Height - y) here to flip it
      Image[((Height - y) * Width) + x] = 0xFF000000 | (Data[0] << 16)
         | (Data[1] << 8) | Data[2];
   }
   
   //Align to 32 bits
   //fseek(File, 4 - (ftell(File) & 3), SEEK_CUR);
}
And the fun part is, it worked the first time. I once again want to troutslap the genius who decided to store bitmaps upside-down in the first place.
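One thing that stands out in the loop as posted is the row index: for `y = 0` it computes `Height - y`, which is one row past the end of a `Width * Height` buffer, and row 0 is never written. Whether that is the bug being referred to is a guess. The sketch below just prints the destination offsets for a tiny image; `flipped_row_offsets` is a made-up helper, and `Height - 1 - y` is the usual adjustment rather than anything taken from the original code.

```python
def flipped_row_offsets(height, width, off_by_one):
    # Offset of the first pixel of each destination row, for source rows 0..height-1.
    offsets = []
    for y in range(height):
        dest_row = (height - y) if off_by_one else (height - 1 - y)
        offsets.append(dest_row * width)
    return offsets

height, width = 4, 3
size = height * width  # valid offsets are 0 .. size - 1

as_posted = flipped_row_offsets(height, width, off_by_one=True)
adjusted = flipped_row_offsets(height, width, off_by_one=False)
print(as_posted, max(as_posted) >= size)  # [12, 9, 6, 3] True
print(adjusted, max(adjusted) >= size)    # [9, 6, 3, 0] False
```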
poxic wrote:You suck. And simultaneously rock. I think you've invented a new state of being.

Iori_Yagami
Posts: 606
Joined: Wed Oct 03, 2007 8:37 pm UTC

Postby Iori_Yagami » Mon Jan 05, 2009 2:05 pm UTC

I once had to pass a list of ints to a SQL stored procedure, which used it to filter some rows in a WHERE clause. There are probably better alternatives (temporary #Tables, XML <parameters>, EXECUTE 'blah-blah-blah', ...) but I did it like this:
format a string like

Code: Select all

',3,5,65,54,86,'
(so as not to catch 86 when I search for a 6), store it into @IntString.
test belonging like this:

Code: Select all

WHERE ... CHARINDEX(',' + LTRIM(STR(tbl.IntField, 2)) + ',', @IntString) <> 0 ...
Cool, no?
Except I failed to notice that the list passed to me was not made of ints, but was rather a user-written list. I stripped all [^0-9,] stuff. I collapsed multiple ',' into one, too.
It seemed OK.
It worked...
Several WEEKS later we encountered a weird bug. Something was not updating properly in the tables when my procedure recalculated it. What???
...Can you guess what was wrong?
It took amazing detective skill to find out the root cause, though...

Spoiler:
The user string contained '54,03,4,06,11'
That'll teach me to not cast back and forth from ints to format it correctly... :mrgreen:
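A rough Python stand-in for the same check, in case the failure mode isn't obvious. The function names are made up for illustration, and the substring test only imitates what the CHARINDEX expression does; it is not the stored procedure itself.

```python
import re

def normalize_naive(user_text):
    # Strip everything except digits and commas, collapse repeated commas,
    # and wrap the result in commas, as described in the post.
    cleaned = re.sub(r"[^0-9,]", "", user_text)
    cleaned = re.sub(r",+", ",", cleaned).strip(",")
    return "," + cleaned + ","

def normalize_via_int(user_text):
    # Same cleanup, but round-trip each token through int() so "03" becomes "3".
    cleaned = re.sub(r"[^0-9,]", "", user_text)
    tokens = [t for t in cleaned.split(",") if t]
    return "," + ",".join(str(int(t)) for t in tokens) + ","

def contains(int_string, value):
    # Stand-in for CHARINDEX(',' + value + ',', @IntString) <> 0
    return ("," + str(value) + ",") in int_string

user_text = "54,03,4,06,11"
print(contains(normalize_naive(user_text), 3))    # False: ",03," never matches ",3,"
print(contains(normalize_via_int(user_text), 3))  # True
```

The string looks perfectly well formed either way, which is why it took weeks to notice.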
They cannot defend themselves; they cannot run away. INSANITY is their only way of escape.

'; DROP DATABASE;--
Posts: 3284
Joined: Thu Nov 22, 2007 9:38 am UTC
Location: Midwest Alberta, where it's STILL snowy

Postby '; DROP DATABASE;-- » Tue Jan 13, 2009 6:27 am UTC

Is it also a problem that this new string doesn't begin and end with a comma?
poxic wrote:You suck. And simultaneously rock. I think you've invented a new state of being.

Pidgeot
Posts: 10
Joined: Fri Oct 17, 2008 10:40 am UTC

Postby Pidgeot » Thu Jan 15, 2009 4:01 am UTC

Several years ago, a friend of mine was writing some simple C program. However, it segfaulted when he ran it (on a Linux box). He couldn't find the bug, so he sent me a copy of the source. Couldn't spot the bug. I tried to compile and run it (Cygwin). No crash.

After doing some digging, we eventually found out that it crashed when it was supposed to do a printf("%d") (I think it was %d, at least - not that it matters), but there wasn't anything in the code that looked even slightly wrong.

This really piqued my curiosity, so I got a compiled binary, loaded it into a disassembler, and started looking; going through the assembly output and the source code side by side.

After a while, I tracked down the issue to what can only have been a genuine compiler bug: the program had two calls to printf("%d"), but the faulty compile didn't actually call the same function both times. It performed a jump to a different, incorrect overload of printf, and this wound up causing the crash. He was compiling on a beta version of GCC that apparently caused this issue.

jb17kx
Posts: 14
Joined: Sun Sep 07, 2008 6:49 am UTC
Location: Melbourne

Postby jb17kx » Fri Jan 16, 2009 6:56 am UTC

Recently I was writing a PHP script that generated a quote for a hypothetical online photo printing service (actually a class).

The algorithm was simple, just collecting a user's selection of quantities and qualities, etc, then doing the maths, checking if they'd supplied a valid discount code, then generating the quote and, if requested, emailing it.

This all worked nice and well, passing all testing until quite late in development, when it decided that if the hypothetical customer decided to select a certain combination of variables their quote would be several times what it should be.

We never did work out why - everything seemed shipshape, the monetary variables and constants were right, and I wasn't accidentally multiplying by system uptime in hours. In the end I recoded that section of the algorithm, cleaning up a few small inelegant methodologies (none of which should have caused the issue) and it all went away.

Then another time the teacher couldn't work out why my code for another project kept returning the same value (4) when it was meant to be generating and working with a random number. He called it a bug until he checked the source and found 4 hard-coded in - then he called it crap.

I preferred to think of it as an Easter Egg for xkcd fans.
How often do you walk up to a bar and say "I'll have two of your finest boobs, thanks"?

Why Two Kay
Posts: 266
Joined: Sun Mar 23, 2008 6:25 pm UTC
Location: Plano, TX

Postby Why Two Kay » Fri Jan 16, 2009 2:07 pm UTC

jb17kx wrote:... and I wasn't accidentally multiplying by system uptime in hours...


This is... a common problem... for you?
tl;dr - I said nothing important.

jb17kx
Posts: 14
Joined: Sun Sep 07, 2008 6:49 am UTC
Location: Melbourne

Postby jb17kx » Fri Jan 16, 2009 9:57 pm UTC

Why Two Kay wrote:This is... a common problem... for you?


I did once give that and another unrelated variable unfortunately similar names - but I worked that out as soon as the error showed itself. :oops:
How often do you walk up to a bar and say "I'll have two of your finest boobs, thanks"?

mrbaggins
Posts: 1611
Joined: Tue Jan 15, 2008 3:23 am UTC
Location: Wagga, Australia

Postby mrbaggins » Fri Jan 16, 2009 10:33 pm UTC

Similar sort of issue to multiplying by uptime...

Debugging someone else's Java. They were trying to make dice roll. Every time they wanted a new roll, they made a whole new instance of the dice class, called its roll method (which had an RNG), then output the result. Testing the code in actual use worked OK, but mass simulation (~100 rolls per second) got results like 44444444444444444333333333333333336666666666666666666666666 etc...

Problem? A new instance meant the seed value was reset to be based on the magic number of seconds since 00:00:00 Jan 1 1970. After one roll, the RNG normally uses the previous value as part of the new seed. Because it was being destroyed and re-created many times per second, though, the seed was the same for up to a second at a time.

Changed the code solely to be more correct (one instance of dice class kept permanently) which also guaranteed the problem never showed up in the code.
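Here is a small Python sketch of the same effect, for illustration only (the original was Java, and this `Dice` class is invented). The timestamp is passed in explicitly so the behaviour is repeatable; it stands in for "whole seconds since 1970" as described above.

```python
import random

class Dice:
    def __init__(self, now_seconds):
        # Seed from whole seconds, standing in for a time-based seed.
        self._rng = random.Random(now_seconds)

    def roll(self):
        return self._rng.randint(1, 6)

def rolls_with_new_dice_each_time(count, now_seconds):
    # A fresh Dice per roll within the same second: same seed, same first roll.
    return [Dice(now_seconds).roll() for _ in range(count)]

def rolls_with_one_dice(count, now_seconds):
    # One long-lived Dice: the generator state advances between rolls.
    dice = Dice(now_seconds)
    return [dice.roll() for _ in range(count)]

same_second = 1232145180  # arbitrary example timestamp
print(rolls_with_new_dice_each_time(20, same_second))  # one value repeated 20 times
print(rolls_with_one_dice(20, same_second))            # varies from roll to roll
```

Keeping a single instance alive is the simple fix; the other common one is sharing a single generator between all the dice.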
Why is it that 4chan is either infinitely awesome, infinitely bad, or "lolwut", but never any intermediary level?

'; DROP DATABASE;--
Posts: 3284
Joined: Thu Nov 22, 2007 9:38 am UTC
Location: Midwest Alberta, where it's STILL snowy

Postby '; DROP DATABASE;-- » Sun Jan 18, 2009 11:42 am UTC

That was a fucking epic one. Continuing on my PSP app, added some background effects and suddenly it would crash when I opened a certain menu. I spent 4 hours trying different things and there didn't seem to be any common denominator. No memory problems, timing issues, invalid data, bad pointers, anything... the best I could track it down was to the entire function that drew the background images. I commented out that function and it worked, but if any line were left uncommented, it would crash after an arbitrary number of frames. Even simple things like pushing a matrix and immediately popping it again, disabling things that were already disabled, doing a translation or rotation even though the matrices would immediately be reset afterward... anything would crash it.

I was actually just in the process of writing a small rant about it when I realized how all of those things were connected. They all issue at least one command to the GPU. Those commands get put into a display list. In RAM. Which was dynamically allocated.

When I turned it back on the battery meter was actually at 0%, so I quickly uncommented all those lines, increased the size of the display list, and ran it... and I kid you not it ran just long enough for me to see it render perfectly and see the battery meter reading one minute, and then died. The coding equivalent of climbing a hill with the last drop of fuel and then coasting down it into the gas station. :D

(And I think I'll change that buffer from 4K to, oh, 512 or so. >_>)
poxic wrote:You suck. And simultaneously rock. I think you've invented a new state of being.

elminster
Posts: 1560
Joined: Mon Feb 26, 2007 1:56 pm UTC
Location: London, UK, Dimensions 1 to 42.

Postby elminster » Sun Jan 18, 2009 4:39 pm UTC

The worst bug I dealt with was on a private server of an MMORPG. It occurred randomly, was nearly impossible to reproduce through testing (despite happening often with the full userbase playing), and followed a slightly different pattern each time.
Basically, the game had an inventory without fixed positions for items, which meant you could place them anywhere within a small screen allocated for your inventory. There was an array of items and an array of XY locations for them. Occasionally, when changing server (data was saved and passed over to the new server), the items would suddenly jumble up. Sometimes only one or two, sometimes loads, most of the time none.
I found out that something was mostly setting a few of them to 0 and sometimes shifting the array values around. It would then happen at the same point or points in the array on the next server change as well, but the point in the array seemed to be random.

After much testing, surveying players, and going through code (For around 2 weeks, which wasn't long since that bug was around for years), I just decided to recode it all since it was trivial and would take less time (Especially since many other people failed at finding it as well). I had the new version ready and working 2 days later.

Iori_Yagami
Posts: 606
Joined: Wed Oct 03, 2007 8:37 pm UTC

Postby Iori_Yagami » Mon Jan 19, 2009 9:29 am UTC

'; DROP DATABASE;-- wrote:Is it also a problem that this new string doesn't begin and end with a comma?

Well, no. That was dealt with. As I say, the format was okay, only the testing function didn't know about tricky zeroes.
That is what is most exciting and annoying about programming: no matter how many details you manage to think of beforehand, they just keep popping up! :mrgreen:
They cannot defend themselves; they cannot run away. INSANITY is their only way of escape.

Berengal
Superabacus Mystic of the First Rank
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway

Postby Berengal » Mon Jan 19, 2009 6:23 pm UTC

Not a bad bug at all, but one I really hate: Whenever I get to the point of actually displaying stuff in pygame, it doesn't work, and I have to spend lots of time figuring it out. It's always something silly, but never the same.
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City

Postby You, sir, name? » Mon Jan 19, 2009 9:33 pm UTC

Right now I'm stricken with a bug that's pretty damn mysterious.

I have a struct (in C) with various function pointers, that take one or two arguments. At some point in the code, the address to the first function argument ends up in the function pointer so when I invoke it I end up in the memory reserved for this (former) argument.

Don't you just love when GDB tells you this is the problem?
#0 0x00000000006053d0 in ?? ()
#1 0x0000000000401434 in auto_merge (cell=0x6053d0, cell2=0x605670) at type.c:110
#2 0x000000000040073a in main (argc=1, argv=0x7fff0ae38d38) at main.c:17

where "auto_merge" basically checks if the function pointer is set, and if so calls it with cell and cell2 as arguments.
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

'; DROP DATABASE;--
Posts: 3284
Joined: Thu Nov 22, 2007 9:38 am UTC
Location: Midwest Alberta, where it's STILL snowy

Postby '; DROP DATABASE;-- » Tue Jan 20, 2009 2:44 am UTC

Berengal wrote:Not a bad bug at all, but one I really hate: Whenever I get to the point of actually displaying stuff in pygame, it doesn't work, and I have to spend lots of time figuring it out. It's always something silly, but never the same.
For me, it's 3D. Any time I try to draw some simple 3D graphics, I inevitably end up spending a good while screwing with constants and settings and trying everything even though I triple-checked the code and logic and it should have worked the first time. Eventually I randomly try something that didn't work before, and now it works. Except the resulting polygons won't be in quite the right place, and I'll have to add a couple magic numbers (found by trial and error) to a couple constants to make it work.
Then I go to add something else, and get the same problem.
poxic wrote:You suck. And simultaneously rock. I think you've invented a new state of being.

Carnildo
Posts: 2023
Joined: Fri Jul 18, 2008 8:43 am UTC

Postby Carnildo » Tue Jan 20, 2009 3:37 am UTC

You, sir, name? wrote:Right now I'm stricken with a bug that's pretty damn mysterious.

I have a struct (in C) with various function pointers, that take one or two arguments. At some point in the code, the address to the first function argument ends up in the function pointer so when I invoke it I end up in the memory reserved for this (former) argument.

Don't you just love when GDB tells you this is the problem?
#0 0x00000000006053d0 in ?? ()
#1 0x0000000000401434 in auto_merge (cell=0x6053d0, cell2=0x605670) at type.c:110
#2 0x000000000040073a in main (argc=1, argv=0x7fff0ae38d38) at main.c:17

where "auto_merge" basically checks if the function pointer is set, and if so calls it with cell and cell2 as arguments.


That looks similar to a bug I posted earlier. Does it go away if you compile without optimization? If so, check for a compiler bug.

You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City

Postby You, sir, name? » Tue Jan 20, 2009 10:51 am UTC

Carnildo wrote:
That looks similar to a bug I posted earlier. Does it go away if you compile without optimization? If so, check for a compiler bug.


I was thinking along those lines. It really looks like it could be compiler-related, since this is what I'd expect to see if stack got mangled and it decided to return to a local variable or something. But I don't think this is the case, since the value of the function pointer actually does change at some point. It could still be compiler related, but I'm leaning towards PEBCAK. I'll be spending the night debugging the assembly output to figure out where the problem lies. Oh yes. Some people sleep or have social lives I hear.
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

kmatzen
Posts: 214
Joined: Thu Nov 15, 2007 2:55 pm UTC
Location: Ithaca, NY

Postby kmatzen » Tue Jan 20, 2009 2:47 pm UTC

You, sir, name? wrote:
I was thinking along those lines. It really looks like it could be compiler-related, since this is what I'd expect to see if stack got mangled and it decided to return to a local variable or something. But I don't think this is the case, since the value of the function pointer actually does change at some point. It could still be compiler related, but I'm leaning towards PEBCAK. I'll be spending the night debugging the assembly output to figure out where the problem lies. Oh yes. Some people sleep or have social lives I hear.


I guess I'm missing something. Why can't you set a watchpoint and see when the function pointer is being replaced?

You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City

Postby You, sir, name? » Tue Jan 20, 2009 7:00 pm UTC

kmatzen wrote:
I guess I'm missing something. Why can't you set a watchpoint and see when the function pointer is being replaced?


That's what I'm doing. It's absolutely crazy. I've nailed the change in value of the variable down to a completely unrelated call to malloc. In fact,

0x400a8b mov -0x14(%rbp),%edi
0x400a8e callq 0x4005a0 <malloc@plt>
0x400a93 mov %rax,%rdx
0x400a96 mov -0x8(%rbp),%rax
0x400a9a mov %rdx,(%rax)

Here is the exact instruction that makes everything mess up.

So, in short, the problem is as follows:
  1. I have a function pointer that I know is valid at some point.
  2. I run malloc in an entirely different section of the code, that has no access to the function pointer or anything that has access to it.
  3. The function pointer now points to the allocated area.

--edit--

Aaah. I found it. Turned out the struct that held the function pointer was allocated using sizeof() the wrong type. And now I have a nice head-shape indent in the wall to remember not to make this mistake again by.
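For the record, a sketch of why that mistake bites only later. It uses Python's ctypes just to compare struct layouts; `SmallCell` and `CellType` are made-up stand-ins, not the real types from type.c.

```python
import ctypes

class SmallCell(ctypes.Structure):
    # Hypothetical smaller type whose size was passed to malloc by mistake.
    _fields_ = [("value", ctypes.c_int)]

class CellType(ctypes.Structure):
    # Hypothetical struct that actually gets stored: data plus a function pointer.
    _fields_ = [("value", ctypes.c_int),
                ("merge", ctypes.c_void_p)]

allocated = ctypes.sizeof(SmallCell)   # what malloc was asked for
needed = ctypes.sizeof(CellType)       # what the code then reads and writes
fn_ptr_end = CellType.merge.offset + ctypes.sizeof(ctypes.c_void_p)

print(allocated, needed)
print(fn_ptr_end > allocated)  # the function pointer lies outside the allocation
```

Since the function pointer sits past the end of what was requested, the allocator is free to hand that memory to the next malloc caller, and whatever that caller stores there lands on top of the pointer. That would match a completely unrelated malloc appearing to change it.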
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

Carnildo
Posts: 2023
Joined: Fri Jul 18, 2008 8:43 am UTC

Postby Carnildo » Wed Jan 21, 2009 4:14 am UTC

kmatzen wrote:
I guess I'm missing something. Why can't you set a watchpoint and see when the function pointer is being replaced?


You can only set watchpoints on memory. With my bug, the function pointer was being clobbered when a register was inappropriately popped off the stack.

[.root/fail]
Posts: 194
Joined: Fri Jul 11, 2008 9:41 pm UTC
Location: middle of nowhere

Postby [.root/fail] » Thu Jan 22, 2009 1:29 am UTC

Windows ME...
Yes, well that's a compelling argument; but you forget one fundamental fact, you suck!

[.root/fail] wrote:Only a loser would sig themselves...

'; DROP DATABASE;--
Posts: 3284
Joined: Thu Nov 22, 2007 9:38 am UTC
Location: Midwest Alberta, where it's STILL snowy

Postby '; DROP DATABASE;-- » Thu Jan 22, 2009 7:23 am UTC

*cheeseburn*

Hm, that might explain earlier issues I'd had with GDB's watchpoints. It wouldn't catch variables being changed if they're in a register at the time.
poxic wrote:You suck. And simultaneously rock. I think you've invented a new state of being.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Postby btilly » Sat Jan 24, 2009 6:13 am UTC

At work I now have a Windows machine where if Notepad makes a file on my Desktop, Notepad can see the file but it doesn't show up on my Desktop and Outlook can't find it.

I made the same file in another folder and it worked fine.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

Onion_Knight
Posts: 22
Joined: Tue Jul 22, 2008 9:13 pm UTC

Postby Onion_Knight » Mon Jan 26, 2009 12:01 am UTC

When I was in my first year, I wrote the following code:

Code: Select all

for( l = 0; 1<4; l++)
    {/*Code goes here*/}


The problem was, on the terminal the computers used, l and 1 looked identical, as opposed to almost identical here.
That little problem took about 4 hours to find, and ended when I re-wrote that line just in case. :(
-Onion Knight

Xanthir
My HERO!!!
Posts: 5426
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: worst bugs ever (or your most hated)

Postby Xanthir » Mon Jan 26, 2009 1:52 am UTC

Onion_Knight wrote:When I was in my first year, I wrote the following code:

Code:

for( l = 0; 1<4; l++)
    {/*Code goes here*/}


The problem was, on the terminal the computers used, l and 1 looked identical, as opposed to almost identical here.
That little problem took about 4 hours to find, and ended when I re-wrote that line just in case. :(

As a general rule, when you absolutely *cannot* find the bug on a line, rewriting it often helps.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

Rysto
Posts: 1460
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: worst bugs ever (or your most hated)

Postby Rysto » Tue Jan 27, 2009 2:27 am UTC

I just spent the last several days at work trying to track down a horrific bug. What we were seeing was that several machines would try to boot and get into a state where they were totally locked up, and the only thing that could be done was to pull the plug and try to start them up again. But once it had happened on a machine, it would always happen at boot.

The problem pointed at one of the most awful pieces of code I've ever seen. It was a driver for a piece of hardware we used. For the more technical among us, this driver was enabling interrupts before installing any interrupt handlers. For the less technical, that's the equivalent of giving somebody a phone number and asking them to call you at it if they have news for you, and then calling up the phone company and asking them to install a phone at your house with that number. (If you don't get what's so bad about this, what happens if your friend tries to call you before the phone company installs the new phone line?)
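
The ordering problem can be shown with a toy model in C. This is only a sketch: `toy_device`, `toy_fire_irq` and the two init functions are invented for illustration, and nothing here touches real hardware or resembles the actual driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_device;
typedef void (*irq_handler_t)(struct toy_device *dev);

/* Toy stand-in for a device and its interrupt line. */
struct toy_device {
    bool irq_enabled;
    irq_handler_t handler;  /* NULL until a handler is installed */
    int handled;            /* interrupts that reached a handler */
    int unhandled;          /* interrupts that fired with nobody listening */
};

static void toy_handler(struct toy_device *dev)
{
    dev->handled++;
}

/* Simulates the device raising an interrupt. */
void toy_fire_irq(struct toy_device *dev)
{
    assert(dev != NULL);
    if (!dev->irq_enabled)
        return;                 /* masked: nothing is delivered */
    if (dev->handler != NULL)
        dev->handler(dev);
    else
        dev->unhandled++;       /* on real hardware this can mean a hang */
}

/* The order described above: enable first, install the handler later. */
int init_enable_first(struct toy_device *dev)
{
    dev->irq_enabled = true;
    toy_fire_irq(dev);          /* the device interrupts during the gap */
    dev->handler = toy_handler;
    return dev->unhandled;
}

/* The safe order: install the handler, then enable. */
int init_install_first(struct toy_device *dev)
{
    dev->handler = toy_handler;
    dev->irq_enabled = true;
    toy_fire_irq(dev);
    return dev->unhandled;
}
```

On a zero-initialised `toy_device`, `init_enable_first` reports one unhandled interrupt and `init_install_first` reports none. The window in the first version may be tiny, but an interrupt only has to land in it once.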

In the end, it wasn't the driver's fault at all, despite it being so awful. No, what was happening was a goddamned Y2K bug (seriously!). If a machine lost power, its internal clock would get set back to Dec 31, 99. A little time would pass and the clock would increment to Jan 1, 00, and all hell would break loose. The only way to fix the problem was to pull the plug... but then the internal clock would lose power and get set back to Dec 31, 99, which would in short order go up to Jan 1, 00, etc, etc.

biolution
Ken
Posts: 560
Joined: Wed Sep 05, 2007 10:05 pm UTC
Location: San Francisco, Ca

Re: worst bugs ever (or your most hated)

Postby biolution » Tue Jan 27, 2009 4:33 am UTC

That is so awesome. A real y2k bug!

douglasm
Posts: 630
Joined: Mon Apr 21, 2008 4:53 am UTC

Re: worst bugs ever (or your most hated)

Postby douglasm » Thu Jan 29, 2009 6:34 pm UTC

Wow. Um, why was the internal clock storing the time as day/month/two-digit-year instead of the standard (milli)seconds since 1970? That format should never ever ever be used for anything but display and input and possibly storage of dates always entered in that format. Not for internal storage of the current time.

Marz
Posts: 156
Joined: Mon Dec 10, 2007 9:13 pm UTC
Location: UK

Re: worst bugs ever (or your most hated)

Postby Marz » Thu Jan 29, 2009 7:42 pm UTC

douglasm wrote:Wow. Um, why was the internal clock storing the time as day/month/two-digit-year instead of the standard (milli)seconds since 1970? That format should never ever ever be used for anything but display and input and possibly storage of dates always entered in that format. Not for internal storage of the current time.

But come 2038 all 32-bit systems start getting interesting.
But yeah, that is a pretty crazy design.

Rysto
Posts: 1460
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: worst bugs ever (or your most hated)

Postby Rysto » Fri Jan 30, 2009 12:25 am UTC

douglasm wrote:Wow. Um, why was the internal clock storing the time as day/month/two-digit-year instead of the standard (milli)seconds since 1970? That format should never ever ever be used for anything but display and input and possibly storage of dates always entered in that format. Not for internal storage of the current time.

I don't know if the internal clock was actually storing the date in that format. All I know is that when the year goes from 99 to 00, Intel ICH parts generate an interrupt, and there was no handler in the system to clear it.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: worst bugs ever (or your most hated)

Postby btilly » Sat Jan 31, 2009 8:39 am UTC

Marz wrote:But come 2038 all 32-bit systems start getting interesting.

How hard can it be to make a signed int into an unsigned int? That pushes the problem off for close to 70 more years...

BTW the bug affects 64-bit systems as well because the 32-bit layout for time shows up in places like network protocols and filesystem layouts.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

Carnildo
Posts: 2023
Joined: Fri Jul 18, 2008 8:43 am UTC

Re: worst bugs ever (or your most hated)

Postby Carnildo » Sat Jan 31, 2009 11:19 am UTC

btilly wrote:
Marz wrote:But come 2038 all 32-bit systems start getting interesting.

How hard can it be to make a signed int into an unsigned int?


Very. If you turn time_t from signed to unsigned, come January 19, 2038, all programs compiled with the signed version will think it's December 13, 1901. Programs that use time_ts to store time differences will need to be rewritten because they'll no longer be able to store negative differences. Dates stored on-disk as time_ts won't have problems, but time differences will need to be re-interpreted.
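
To make the reinterpretation concrete, here is a small C sketch that decodes the same raw 32-bit pattern both ways. The function names are hypothetical, and the decode is done by hand so it doesn't depend on implementation-defined casts.

```c
#include <stdint.h>

/* How a binary built with a signed 32-bit time_t reads the raw bits. */
int64_t seconds_if_signed(uint32_t raw)
{
    /* Manual two's-complement decode of the 32-bit pattern. */
    if (raw >= UINT32_C(0x80000000))
        return (int64_t)raw - INT64_C(4294967296);
    return (int64_t)raw;
}

/* How a binary built with an unsigned 32-bit time_t reads the same bits. */
int64_t seconds_if_unsigned(uint32_t raw)
{
    return (int64_t)raw;
}

/*
 * 0x7FFFFFFF: both readings agree, 2038-01-19 03:14:07 UTC.
 * 0x80000000: signed reading is -2147483648, i.e. 1901-12-13 20:45:52 UTC;
 *             unsigned reading is 2147483648, i.e. 2038-01-19 03:14:08 UTC.
 * 0xFFFFFFFF: unsigned reading is 4294967295, i.e. 2106-02-07 06:28:15 UTC.
 */
```

So the unsigned reading buys time until early 2106, but any binary still doing the signed decode jumps back to 1901 one second after 03:14:07 on January 19, 2038.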

hotaru
Posts: 1045
Joined: Fri Apr 13, 2007 6:54 pm UTC

Re: worst bugs ever (or your most hated)

Postby hotaru » Sat Jan 31, 2009 3:32 pm UTC

Carnildo wrote:
btilly wrote:
Marz wrote:But come 2038 all 32-bit systems start getting interesting.

How hard can it be to make a signed int into an unsigned int?


Very. If you turn time_t from signed to unsigned, come January 19, 2038, all programs compiled with the signed version will think it's December 13, 1901. Programs that use time_ts to store time differences will need to be rewritten because they'll no longer be able to store negative differences. Dates stored on-disk as time_ts won't have problems, but time differences will need to be re-interpreted.

of course if you make time_t a 64-bit signed integer, you don't have those problems. unless programs do stupid things like assuming that time_t is only 32 bits...

Code:

factorial = product . enumFromTo 1
isPrime n = factorial (n - 1) `mod` n == n - 1

phlip
Restorer of Worlds
Posts: 7573
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia

Re: worst bugs ever (or your most hated)

Postby phlip » Sat Jan 31, 2009 4:47 pm UTC

hotaru wrote:of course if you make time_t a 64-bit signed integer, you don't have some of those problems. unless programs do stupid things like assuming that time_t is only 32 bits... or attempting to maintain backwards compatibility with a binary file-format/file-system/network protocol/etc that requires 4 bytes for the date... or run on an embedded system that hasn't had to be redesigned for years (and will probably continue to be used for decades), and doesn't handle 64-bit ints...

Fixed.
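
A rough C sketch of the 4-byte-field case: a 64-bit time goes in, only the low 32 bits fit, and an old reader decodes them as signed. The little-endian layout and all function names here are assumptions for illustration, not any particular file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical legacy format: seconds stored in a 4-byte little-endian field. */
void store_time32(int64_t t, unsigned char out[4])
{
    assert(out != NULL);
    uint32_t low = (uint32_t)t;  /* keeps only the low 32 bits (mod 2^32) */
    out[0] = (unsigned char)(low & 0xFFu);
    out[1] = (unsigned char)((low >> 8) & 0xFFu);
    out[2] = (unsigned char)((low >> 16) & 0xFFu);
    out[3] = (unsigned char)((low >> 24) & 0xFFu);
}

/* A legacy reader that treats the field as a signed 32-bit value. */
int64_t load_time32(const unsigned char in[4])
{
    assert(in != NULL);
    uint32_t raw = (uint32_t)in[0]
                 | ((uint32_t)in[1] << 8)
                 | ((uint32_t)in[2] << 16)
                 | ((uint32_t)in[3] << 24);
    if (raw >= UINT32_C(0x80000000))
        return (int64_t)raw - INT64_C(4294967296);
    return (int64_t)raw;
}

/* Write a 64-bit time through the 4-byte field and read it back. */
int64_t round_trip(int64_t t)
{
    unsigned char field[4];

    store_time32(t, field);
    return load_time32(field);
}
```

Values up to 2147483647 survive `round_trip` unchanged; 2147483648 comes back as -2147483648. The program itself can use a 64-bit time_t and still get bitten as soon as the value passes through the old field.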

Code:

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

biolution
Ken
Posts: 560
Joined: Wed Sep 05, 2007 10:05 pm UTC
Location: San Francisco, Ca

Re: worst bugs ever (or your most hated)

Postby biolution » Sat Jan 31, 2009 7:55 pm UTC

4 bytes should be enough for -anybody-. Next thing you're going to ask for is more than a megabyte of ram, huh? Damn kids.

You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City

Re: worst bugs ever (or your most hated)

Postby You, sir, name? » Sat Jan 31, 2009 9:23 pm UTC

biolution wrote:4 bytes should be enough for -anybody-. Next thing you're going to ask for is more than a megabyte of ram, huh? Damn kids.


4 BYTES? We used to dream of 4 bytes. There were 16 of us crammed into 2 bytes, and the computer developed such heat that it was necessary for it to be cooled in liquid hydrogen, so when we woke up in the morning, we had to amputate all our arms and legs because it was so cold, so there we flopped around like limbless torsos, dragging ourselves around with our teeth while constantly getting electric shocks. Hah! 4 bytes.
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: worst bugs ever (or your most hated)

Postby btilly » Mon Feb 02, 2009 2:31 pm UTC

Carnildo wrote:
btilly wrote:
Marz wrote:But come 2038 all 32-bit systems start getting interesting.

How hard can it be to make a signed int into an unsigned int?

Very. If you turn time_t from signed to unsigned, come January 19, 2038, all programs compiled with the signed version will think it's December 13, 1901. Programs that use time_ts to store time differences will need to be rewritten because they'll no longer be able to store negative differences. Dates stored on-disk as time_ts won't have problems, but time differences will need to be re-interpreted.

The fact that unchanged programs will break is unavoidable no matter what solution you use. I had not thought of the issue of using time_t to store a time difference, and that is a good one. However, binary compatibility with things like filesystem formats is nice, and I predict that there will be at least some signed-to-unsigned mappings for that reason.
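
For the record, the time-difference issue looks like this in a minimal C sketch. The two functions are hypothetical stand-ins for code that subtracts one 32-bit timestamp from another.

```c
#include <stdint.h>

/* Difference stored in a signed 32-bit type: negative results are representable.
 * (Inputs are assumed close enough together that the subtraction can't overflow.) */
int32_t elapsed_signed(int32_t a, int32_t b)
{
    return a - b;
}

/* Difference stored in an unsigned 32-bit type: the result wraps modulo 2^32,
 * so "60 seconds earlier" becomes a huge positive number. */
uint32_t elapsed_unsigned(uint32_t a, uint32_t b)
{
    return a - b;
}
```

With `a = 100` and `b = 160`, the signed version returns -60 and the unsigned version returns 4294967236, which is 2^32 - 60. Any code that checks for a negative difference stops working once the type is unsigned.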
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

