Crash in atheme-services!
While stuffing about trying to upgrade this ancient infrastructure, Atheme promptly exploded in a core-dump. I guessed (correctly) that it was related to the old configuration file being used on a newer version, rewrote the configuration file, and off we went. After I got everything mostly squared away, I decided to come back and take a look and see if there was anything interesting to it.
I built it on my local box, installed ratbox, hacked up the configuration file and… boom. Core files aren’t dropping on my box, so I just ran it inside of gdb, to catch the backtrace:
Program received signal SIGSEGV, Segmentation fault.
__strcmp_sse2_unaligned ()
at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:29
29 ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __strcmp_sse2_unaligned ()
at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:29
#1 0x00007ffff5ffe2a0 in m_pass (si=<optimized out>, parc=4,
parv=0x7fffffffdb30) at ts6-generic.c:1202
#2 0x00007ffff5df9cff in irc_parse (line=<optimized out>) at parse.c:176
#3 0x00007ffff772ea1e in irc_recvq_handler (cptr=<optimized out>)
at packet.c:55
#4 0x00007ffff7726581 in recvq_put (cptr=0x607b30) at datastream.c:266
#5 0x00007ffff798c713 in mowgli_epoll_eventloop_select (
eventloop=0x7ffff7e43f70, delay=<optimized out>) at epoll_pollops.c:188
#6 0x00007ffff798d6a4 in mowgli_simple_eventloop_timeout_once (
eventloop=0x7ffff7e43f70, timeout=<optimized out>) at null_pollops.c:57
#7 0x00007ffff798cd94 in mowgli_eventloop_run_once (eventloop=0x7ffff7e43f70)
at eventloop.c:131
#8 0x00007ffff773480d in io_loop () at send.c:77
#9 0x00007ffff771a957 in atheme_main (argc=<optimized out>,
argv=0x7fffffffe338) at atheme.c:442
#10 0x00007ffff737bb45 in __libc_start_main (main=0x4005c0 <main>, argc=2,
argv=0x7fffffffe338, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fffffffe328) at libc-start.c:287
#11 0x00000000004005ee in _start ()
(gdb)
Frame #1 suggests it’s in m_pass, which, if Atheme is anything like an IRC server, is the function dispatched to handle PASS messages. Knowing what I know about the configuration file that triggered it, I suspected I knew where this was going, and sure enough, line 1202 of ts6-generic.c looks like this:
if (strcmp(curr_uplink->receive_pass, parv[0]))
At this point I was pretty sure curr_uplink->receive_pass is a null pointer, but I hacked in some debug printf()s to be sure and yep:
[2017-07-12 22:22:13] uplink_add(): services.int -> hades.arpa:6667 - 14584490 0
The first of the two trailing values is send_password, which was set; the second is receive_password, which wasn’t set in this old configuration (because at the time it was set up, Atheme didn’t check the password the server sent). I thought that perhaps this one setting simply wasn’t checked, but in fact none of these things are… A null send_pass simply doesn’t break anything because it’s expanded to (null) when services sends its PASS message:
[2017-07-12 22:33:14] <- PASS (null) TS 6 :00A
… which we get away with. So it’s probably not even worth fixing - how many people are going to run into a situation where they have the exact same brain-dead configuration file we had? I reported it in the Atheme IRC channel and that’ll do.
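For illustration, here is a minimal sketch of the kind of guard that would keep a PASS handler from passing a null pointer to strcmp(). It is not Atheme's code: struct uplink_sketch and uplink_pass_ok() are hypothetical names I made up, and only the two password fields from this post are modelled. Treating a missing receive password as a mismatch is also my assumption; a real fix might prefer to reject the configuration earlier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the uplink structure; only the two
 * password fields discussed in this post are modelled. */
struct uplink_sketch {
    const char *send_pass;    /* password sent to the ircd */
    const char *receive_pass; /* password expected back; NULL if unset */
};

/* Returns true only when both strings exist and match.
 * strcmp() must never be handed a null pointer, so every
 * pointer is checked before the comparison. */
bool uplink_pass_ok(const struct uplink_sketch *u, const char *received)
{
    if (u == NULL || u->receive_pass == NULL || received == NULL)
        return false;

    return strcmp(u->receive_pass, received) == 0;
}
```

With a guard like this, an uplink without a receive password fails the comparison and the link can be refused cleanly, instead of the process taking a SIGSEGV inside strcmp().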
Update: @grawity noted in IRC that invalid configuration files shouldn’t cause a crash, and filed a pull request to fix it. I’m rather happy that despite the fact I know very little about the Atheme codebase these days, it’s about what I’d have done to fix the issue too (though I hadn’t noticed it affects other protocols as well).
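To be clear about what “invalid configuration files shouldn’t cause a crash” can look like in practice, here is a rough sketch of validating an uplink block when the configuration is loaded. This is not the actual pull request, which I haven’t reproduced here; struct uplink_conf_sketch and uplink_conf_problem() are hypothetical names, and the only real names are the send_password and receive_password options mentioned above.

```c
#include <stddef.h>

/* Hypothetical, simplified view of one parsed uplink block. */
struct uplink_conf_sketch {
    const char *name;             /* server name of the uplink */
    const char *send_password;    /* NULL if the option was absent */
    const char *receive_password; /* NULL if the option was absent */
};

/* Returns NULL when the block is usable, otherwise a static
 * message describing the first problem found. */
const char *uplink_conf_problem(const struct uplink_conf_sketch *c)
{
    if (c == NULL || c->name == NULL)
        return "uplink block has no name";
    if (c->send_password == NULL)
        return "uplink block is missing send_password";
    if (c->receive_password == NULL)
        return "uplink block is missing receive_password";

    return NULL;
}
```

The idea is that the loader logs the message and skips the block, so nothing later in the protocol code ever sees a half-filled uplink.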