Now WTH is Blind Debugging? Well, thats what I have been doing for the past 2 weeks everyday for 14-16 hours a day! I am working on a new top secret project(!) where there is a lot of intercommunication happening between 2 systems. One of them is a master and one is a slave. The intercommunication happens with a list of complicated protocols. Now comes the bad part. I don’t have source code access to either of the systems and my task is to get some things working in both the systems!
And that my dear friends is what I call blind debugging. All I can do is look at the kernel logs and look at the other systems logs and try to figure out what is happening. Of course both the systems are very unstable and all the logs are coming over UARTs. And there is no time sync and a lot of log drops because of buffer overflows. So how do we go about doing the blind debugging?
For the particular problem I faced, I started out with creating user-space programs that could start dumping everything received on a particular io node. And then another system engineer got me a binary which could print everything that it was sending from the user space. Collect the painstaking log hopping that some of the prints are not lost (for the other system) and do a painstaking byte comparison. Of course it is much better since they are all using the same endianness :).
In short found out that the kernel driver was stripping off some more bytes from the protocol instead and giving me incomplete data on the io node. But to rant about it, it is painful and bloody dump to try to do this. Well, thats work! Lets see what I face tomorrow…