Inessential kernel ______

John Hawkinson
Student Information Processing Board
Massachusetts Institute of Technology
84 Massachusetts Avenue
Cambridge, Massachusetts 02139

ABSTRACT

This document provides a general guide to common operations you might want to perform on kernels. It is targeted at the platforms I use, including SunOS, IRIX, NetBSD, and Solaris, with a smattering of IOS. "Common operations" include patching variables, patching code, and debugging crashes.

Todo-type notes. This document probably needs to finish being written before it gets too much commentary. I've incorporated my responses to aurora & katyking's comments scribbled on the 3 October edition. I've ignored some of them in the interest of maintaining conversational jhawk-style documentation, but implemented some others.

1 December 1996

1. Introduction

The kernel is the inner core of the UNIX† operating system. The kernel is responsible for loading executable programs, maintaining processes, context-switching between them, and providing services to those programs through interfaces such as system calls and traps.

Because of these things, if you experience a fault or crash while the processor is executing in the kernel (often a "kernel panic"), the machine will require a reboot. Contrast this with a user-level program, which will exit (perhaps dumping core) and may be re-run without major disruption to the entire operating system.

Kernels come in different shapes and sizes.
Under classic BSD-derived systems like SunOS and NetBSD, the kernel exists as one file in the root directory of the filesystem -- /vmunix and /netbsd, respectively. Under Solaris, the kernel consists of a large number of dynamically loaded[1] files. /kernel/unix is the principal part of the kernel, and it loads various different modules from specified locations, including subdirectories of /kernel and /usr/kernel. IRIX has its own way of doing things that I'm not 100% clear on, so that section is a bit more to-be-written than the rest (but there's something in /unix).

_________________________
† UNIX is a trademark of Bell Laboratories.

So why is this document named Inessential kernel ______ instead of some more sensible name? Well, I couldn't decide what the ______ should represent[2], so it has been left up to your imagination. When the document title is pronounced,
the ______ should be accompanied by expansive hand-waving motions.

_________________________
[1] NetBSD and SunOS have support for dynamically loadable kernel modules; however, they are not an integral part of the operating system, and typically require a user-level program to actually load the modules.

2. Examining your kernel

Before you start too far down the path of kernel hacking, you should take a moment to acquaint yourself with your surroundings.

How large is your kernel? (Kernels typically range from 1 to 3 megabytes in size[3].) Do you have space for more than one copy of the kernel on your root partition? (Typically the part of the operating system responsible for loading the kernel, usually the "loader", can only load the kernel from the root partition of the system.)

What version of your kernel is installed? Sometimes this is relatively easy to figure out, sometimes it is not. When the operating system boots, this information is usually displayed; often it is logged by syslogd(8), or is available through a command that displays the kernel message buffer, such as dmesg(8).

It is also often useful to know who built your kernel. While you can't always run crying to them for help, chances are they are less than totally clueless about goings on.

Under NetBSD, you can obtain the version of the running kernel, along with information about who built it and where, with the sysctl(8) command:

    [planet-zorp!jhawk] ~> sysctl kern.version
    kern.version = NetBSD 1.1 (ATHENAOTHER) #4: Thu Jan 18 03:29:37 EST 1996
    ghudson@zygorthian-space-raiders:/u1/build/sys/arch/i386/compile/ATHENAOTHER

Of course, it's often useful to find out information about a particular kernel image on disk, rather than the running kernel. You might expect the kernel build process to provide this information using the RCS $Tag$ convention, so it could be extracted with the ident(1) command; however, this is not the case.
All is not lost, as the predecessor to the ident(1) command -- the SCCS what(1) command -- may be used instead. what(1) searches for a string beginning with @(#) and displays the rest of the text. Again, under NetBSD:

    [planet-zorp!jhawk] ~> what /netbsd
    /netbsd
            NetBSD 1.1 (ATHENAOTHER) #4: Thu Jan 18 03:29:37 EST 1996

_________________________
[2] Possible suggestions included "crap", "crud", and "stuff". None of these were particularly appealing.
[3] Kernels are typically built without debugging symbols. An unstripped kernel built with debugging symbols may be significantly larger, perhaps 13 megabytes in size.

Not all operating systems are as agreeable. More often than not, what(1) or ident(1) will merely produce identification strings associated with modules linked into the kernel, some of which may not even be significant. Far more reliable are various permutations of the strings(1) command. For instance:

NetBSD:

    [planet-zorp!jhawk] ~> strings - /netbsd | grep -A1 '@(#)NetBSD'
    @(#)NetBSD 1.1 (ATHENAOTHER) #4: Thu Jan 18 03:29:37 EST 1996
    ghudson@zygorthian-space-raiders:/u1/build/sys/arch/i386/compile/ATHENAOTHER
    [planet-zorp!jhawk] ~>

SunOS:

    [all-purpose-gunk!jhawk] ~> strings - /vmunix | grep SunOS
    SunOS Release 4.1.4 (ALL-PURPOSE-GUNK) #4: Thu Apr 11 19:16:05 EDT 1996

Solaris (yes, this is pretty lame):

    [bart-savagewood!jhawk] ~> /usr/ccs/bin/what /kernel/unix
    /kernel/unix:
            SunOS 5.4 Generic 101945-37 December 1995

Of course, there are other things to examine than just versioning information. nm(1) lists symbols inside an executable (the kernel is normally in a standard file-format for operating system executables). The start of the kernel typically contains a symbol table[4] which contains function and variable names and addresses. nm(1) basically just dumps this information.
For instance, here are some symbols that nm(1) reports under SunOS:

    [all-purpose-gunk!jhawk] ~> nm /vmunix | grep _ip_f
    f001cfe4 T _ip_forward
    f01c7a50 D _ip_forwarding
    f001c69c T _ip_freef
    f0022310 T _ip_freemoptions

_________________________
[4] Under most kernels, you can hear the symbol table quite clearly at the start of the file if you listen to the kernel, e.g.: cat /vmunix > /dev/audio. Be prepared to adjust the volume down (consider using headphones) and to have others look at you strangely.

3. Booting the kernel

Under recent-vintage Sun 4 machines (i.e. SPARC machines) the OpenBOOT PROM monitor permits the following syntax for booting:

    boot device filename

where device defaults to disk and filename to /vmunix under SunOS.

Under NetBSD/i386, the boot loader displays a prompt as the machine is booting, and you're given a short period of time to respond before it produces a spinning combination of |, /, -, and \ characters. It will default to booting /netbsd, but if you're quick you can type in another name (e.g.: /netbsd.new). Under NetBSD/SPARC you should just follow the previous instructions.

4. Building a kernel

Building a kernel isn't necessary on all operating systems (well, it may be _necessary_, but in some cases you may not have that option without purchasing a source license, since some vendors don't ship enough source code...), but on some of them it really is, particularly SunOS and NetBSD (and other classic BSD systems). Don't even try it under Solaris, and the faint-of-heart may attempt IRIX.

In the classic BSD system, the kernel source tree lives in /sys. Typically this is a symlink. Under SunOS, it leads to /usr/kvm/sys and under NetBSD it leads to /usr/src/sys.
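
A quick way to see where /sys leads on a particular machine is to ask the filesystem. The following is only a minimal sketch: show_link is a made-up helper name, not a standard command, and it assumes a Bourne-style shell whose test(1) supports -h and an ls -l that prints symlinks as "name -> target".

```shell
# show_link PATH: print where a symbolic link points, or PATH itself
# when PATH is not a symlink. (show_link is a hypothetical helper name.)
show_link() {
    if [ -h "$1" ]; then
        # ls -l ends a symlink's line with "name -> target";
        # strip everything up to and including the arrow.
        ls -l "$1" | sed 's/^.* -> //'
    else
        echo "$1"
    fi
}

show_link /sys
```

If /sys is set up as described above, this should report /usr/kvm/sys under SunOS and /usr/src/sys under NetBSD; on a machine where /sys is not a symlink it simply echoes the path back.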

To compile a kernel, just cd to the appropriate directory (/sys/arch/{i386,sparc,etc.}/conf for NetBSD, /sys/{sun4m,sun4c}/conf for SunOS), edit the kernel configuration file (kernel configuration files are traditionally named in all caps, like GENERIC and TELEGENIC), and type config KERNEL-NAME.

So why would you actually want to rebuild your kernel? If you change the device configuration associated with your machine (port numbers, IRQs, slot numbers, etc.) you may need to rebuild your kernel. Perhaps more importantly, many operating systems come with stupid or insufficient default values. Under SunOS, you might want to apply important security patches distributed by Sun (usually there are a boatload...).

Under SunOS, even though you don't have full kernel sources, you can still rebuild the kernel. This is because Sun ships object files ready to be linked. This is why building a kernel under SunOS may take 3 minutes but building a similar kernel under NetBSD may take an hour.

Under SunOS, I typically apply the multicast patches (available from ftp://parcftp.xerox.com/pub/net-research/ipmulti/), as well as support for the Berkeley Packet Filter (ftp://ftp.ee.lbl.gov/bpf.tar.Z). I'm also careful to turn on UDP checksums[5], increase the size of tcp_sendspace and tcp_recvspace, and increase the TCP MSS (Maximum Segment Size) from the lame default of 512 to 1460[6]. To accomplish this, I add these lines to my config file[7]:

    # Various parameters in in_proto.c:
    options "TCPSENDSPACE=24*64"    # We live in a world of high bw*delay,
    options "TCPRECVSPACE=24*64"    # so let's try and deal.
    options "TCPDEFAULTMSS=1460"    # Tcp max. segment size.
    options "UDPCKSUM=-1"           # Udp checksums

And then apply the following patch to /sys/netinet/in_proto.c:

    *** in_proto.c 1996/02/21 04:33:25 1.2
    --- in_proto.c 1996/02/21 05:01:39 1.3
    ***************
    *** 153,165 ****
       * Default TCP Maximum Segment Size - 512 to be conservative,
       * Higher for high-performance routers
       */
    ! int tcp_default_mss = 512;
      
      /*
       * Default TCP buffer sizes (in bytes)
       */
    ! int tcp_sendspace = 1024*4;
    ! int tcp_recvspace = 1024*4;
      
      /*
       * size of "keep alive" probes.
    --- 153,174 ----
       * Default TCP Maximum Segment Size - 512 to be conservative,
       * Higher for high-performance routers
       */
    ! #ifndef TCPDEFAULTMSS
    ! #define TCPDEFAULTMSS 512
    ! #endif
    ! int tcp_default_mss = TCPDEFAULTMSS;
      
      /*
       * Default TCP buffer sizes (in bytes)
       */
    ! #ifndef TCPSENDSPACE
    ! #define TCPSENDSPACE (1024*4)
    ! #endif
    ! #ifndef TCPRECVSPACE
    ! #define TCPRECVSPACE (1024*4)
    ! #endif
    ! int tcp_sendspace = TCPSENDSPACE;
    ! int tcp_recvspace = TCPRECVSPACE;
      
      /*
       * size of "keep alive" probes.
    ***************
    *** 170,176 ****
      int tcp_keepidle = TCPTV_KEEP_IDLE; /* for Keep-alives */
      int tcp_keepintvl = TCPTV_KEEPINTVL;
      
    ! int udp_cksum = 0; /* turn on to check & generate udp checksums */
      int udp_ttl = 60; /* default time to live for UDPs */
      
      /*
    --- 179,188 ----
      int tcp_keepidle = TCPTV_KEEP_IDLE; /* for Keep-alives */
      int tcp_keepintvl = TCPTV_KEEPINTVL;
      
    ! #ifndef UDPCKSUM
    ! #define UDPCKSUM 0
    ! #endif
    ! int udp_cksum = UDPCKSUM; /* turn on to check & generate udp checksums */
      int udp_ttl = 60; /* default time to live for UDPs */
      
      /*

_________________________
[5] It used to be thought that UDP checksums weren't very important because packets wouldn't get corrupted very often and they were expensive to compute. At this point computation isn't very expensive at all, and not having UDP checksums turned on can lead to corrupted data in surprising places, like the Domain Name System (DNS). You really want them on.
[6] Because SunOS doesn't support Path MTU Discovery (RFC1191), this may result in some packets being fragmented by routers just before low-MTU links. IMHO, folks maintaining links with MTUs beneath 1500 deserve to lose, so this result is just fine.
[7] note. this whole section needs reformatting

5. Interactively debugging a kernel

kadb, kgdb, ddb, symmon

5.1. Using kadb

kadb is the Sun kernel debugger. It may be booted with boot kadb, at which point the boot loader loads the kernel debugger, and the kernel debugger then loads the SunOS or Solaris kernel.

kadb accepts standard adb commands, including :c to continue (resume execution of the OS) and $q to quit (exit to the ROM monitor).

6. Configuring kernel crash dumps

7. Forcing a kernel crash dump

On an OpenBOOT SPARC:

    0 set-pc go

8. Debugging a kernel crash dump

8.1. Using adb or dbx

8.2. Using crash

9. Patching the kernel

9.1. Patching variables

9.2. Patching code

9.2.1. Patching the running kernel

Dungeon would ask you, "Do you wish me to try to patch you?", and if you said "No", it would report:

    What? You don't trust me? Why, only last week I patched a running RSX system and it survived for over thirty seconds. Oh, well.

10. Acknowledgements

Without the inspiration of Inessential 'roff this document would never have been approached.

11. Possible appendix A?

ps, kill, etc. /afs/sipb/user/jhawk/src/openprom

12. Possible appendix B?

/afs/sipb/user/jhawk/src/sunostos

13. References