%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%									%
%	Copyright (C) 1992, 1993 Michael K. Johnson,			%
%	johnsonm@SunSITE.unc.edu					%
%									%
%	This file is freely copyable, but you must preserve this	%
%	copyright notice on all copies, it must only be distributed	%
%	as part of the Linux Kernel Hackers' Guide, and its use is	%
%	is subject to the conditions expressed in the copyright for	%
%	the whole guide, in the file prelim/copyright.tex		%
%									%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\section{Supporting Functions}\label{sec-dev-funcs}

Here is a list of many of the most common supporting functions
available to the device driver writer:

{\bf [Please fill in any missing information, especially Defined in: parts]}

\begin{dispitems}
\item[{\tt add\_request()}]\begin{verbatim}
static void add_request(struct blk_dev_struct *dev,
                        struct request * req)\end{verbatim}

This is a static function in ll\_rw\_block.c, and cannot be called by
other code.  However, an understanding of this function, as well as an
understanding of {\tt ll\_rw\_block()}, may help you understand the
strategy routine.

If the device that the request is for has an empty request queue, the
request is put on the queue and the strategy routine is called.
Otherwise, the proper place in the queue is chosen and the request is
inserted in the queue, maintaining proper order by insertion sort.

Proper order (the elevator algorithm) is defined as:
\begin{description}
\item[a.] Reads come before writes.
\item[b.] Lower minor numbers come before higher minor numbers.
\item[c.] Lower block numbers come before higher block numbers.
\end{description}
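
The three ordering rules can be read as a single comparison between
two requests.  The following is only a sketch: {\tt struct
sketch\_request}, its fields, and {\tt sketch\_in\_order()} are names
invented for this illustration, not the kernel's own definitions, and
it assumes that a read command is encoded as a smaller number than a
write command.

\begin{verbatim}
/* Illustrative sketch only; these names are hypothetical. */
enum { SKETCH_READ = 0, SKETCH_WRITE = 1 }; /* assumed encoding */

struct sketch_request {
    int cmd;              /* SKETCH_READ or SKETCH_WRITE */
    unsigned int minor;   /* minor device number */
    unsigned long block;  /* starting block number */
};

/* Nonzero if a may be served no later than b under rules a to c. */
static int sketch_in_order(const struct sketch_request *a,
                           const struct sketch_request *b)
{
    if (a->cmd != b->cmd)
        return a->cmd < b->cmd;       /* a. reads before writes */
    if (a->minor != b->minor)
        return a->minor < b->minor;   /* b. lower minor first   */
    return a->block <= b->block;      /* c. lower block first   */
}
\end{verbatim}

An insertion sort over the queue would walk forward until it finds a
neighbouring pair that the new request fits between according to this
comparison.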

{\bf Defined in:} kernel/blk\_drv/ll\_rw\_block.c\\
{\bf See also:} {\tt make\_request()}, {\tt ll\_rw\_block()}.


\item [{\tt add\_timer()}]
{\tt void add\_timer(long jiffies, void (*fn)(void))}

{\tt \#include <linux/sched.h>}

Causes a function to be executed when a given amount of time has
passed.  Takes the following arguments:
\begin{dispitems}
\item [{\tt jiffies}] The number of jiffies (100ths of a second) to
time out after.
\item [{\tt fn}] Kernel-space function to run after the timeout has
occurred.
\end{dispitems}

{\bf Note:} This is {\em not} process-specific.  Therefore, if you
want to wake a certain process at this timeout, you will have to use
the sleep and wake primitives.  Also, be cautious, as the kernel will
panic if over 64 timers are set.  {\bf [Is this still true?  There was
noise about dynamically allocated timers some time ago.]}

{\bf Defined in:}\\
{\bf See also:}


\item [{\tt cli()}]
{\tt \#define cli() \_\_asm\_\_ \_\_volatile\_\_ ("cli"::)}

{\tt \#include <asm/system.h>}

Prevents interrupts from being acknowledged.  {\tt cli} stands for
``CLear Interrupt enable''.

{\bf See also:} {\tt sti()}


\item[{\tt end\_request()}]
{\tt static void end\_request(int uptodate)}

{\tt \#include "blk.h"}

Called when a request has been satisfied or aborted.  Takes one
argument:
\begin{dispitems}
\item [{\tt uptodate}]
If not equal to 0, means that the request has been satisfied.\\
If equal to 0, means that the request has not been satisfied.
\end{dispitems}

If the request was satisfied ({\tt uptodate != 0}), {\tt
end\_request()} maintains the request list, unlocks the buffer, and
may arrange for the scheduler to be run at the next convenient time
({\tt need\_resched = 1}), before waking up all processes sleeping on
the {\tt wait\_for\_request} event, which is slept on in {\tt
make\_request()}, {\tt ll\_rw\_page()}, and {\tt ll\_rw\_swap\_file()}.  

{\bf Note:} This function is a static function, defined in
kernel/blk\_drv/blk.h for every non-SCSI device that includes blk.h.
It includes several defines dependent on static device information,
such as the device number.  This is marginally faster than a more
generic, ordinary C function.

{\bf Defined in:} kernel/blk\_drv/blk.h\\
{\bf See also:} {\tt ll\_rw\_block()}, {\tt add\_request()}, {\tt
make\_request()}.


\item [{\tt free\_irq()}]
{\tt void free\_irq(unsigned int irq)}

{\tt \#include <linux/sched.h>}

Frees an IRQ previously acquired with {\tt request\_irq()} or {\tt
irqaction()}.  Takes one argument:
\begin{dispitems}
\item [{\tt irq}] interrupt level to free.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt request\_irq()}, {\tt irqaction()}.


\item [{\tt get\_fs*()}]
{\tt inline unsigned char get\_fs\_byte(const char * addr)}\\
{\tt inline unsigned short get\_fs\_word(const unsigned short * addr)}\\
{\tt inline unsigned long get\_fs\_long(const unsigned long *addr)}

{\tt \#include <asm/segment.h>}

Allows a driver to access data in user space, which is in a different
segment than the kernel.  When entering the kernel through a system
call, a selector for the current user space segment is put in the fs
segment register, thus the names.

{\bf Note:} these functions may cause implicit I/O, if the memory
being accessed has been swapped out, and therefore pre-emption may
occur at this point.  Do not include these functions in critical
sections of your code unless the critical sections are protected by
{\tt cli()}/{\tt sti()} pairs.

These functions take one argument:
\begin{dispitems}
\item [{\tt addr}] Address to get data from.
\end{dispitems}
\begin{dispitems}
\item [{\bf Returns:}] Data at that offset in the fs segment.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt memcpy\_*fs()}, {\tt put\_fs*()}, {\tt cli()},
{\tt sti()}.


\item [{\tt inb(), inb\_p()}]
{\tt inline unsigned int inb(unsigned short port)}\\
{\tt inline unsigned int inb\_p(unsigned short port)}

{\tt \#include <asm/io.h>}

Reads a byte from a port.  {\tt inb()} goes as fast as it can, while
{\tt inb\_p()} pauses before returning.  Some devices are happier if
you don't read from them as fast as possible.  Both functions take one
argument:
\begin{dispitems}
\item [{\tt port}] Port to read byte from.
\end{dispitems}
\begin{dispitems}
\item [{\bf Returns:}] The byte is returned in the low byte of the
32-bit integer, and the 3 high bytes are unused, and may be garbage.
{\bf [Or are they cleared?  Find this out.]}
\end{dispitems}
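
Since the three high bytes of the return value may not be meaningful,
a cautious caller can mask the result down to its low byte.  A minimal
sketch, in which {\tt sketch\_low\_byte()} is a hypothetical helper and
its argument stands in for a value returned by {\tt inb()}:

\begin{verbatim}
/* Hypothetical helper: keep only the low 8 bits of a port read. */
static unsigned char sketch_low_byte(unsigned int raw)
{
    return (unsigned char)(raw & 0xff);
}
\end{verbatim}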

{\bf Defined in:}\\
{\bf See also:} {\tt outb()}, {\tt outb\_p()}.


\item [{\tt irqaction()}]
{\tt int irqaction(unsigned int irq, struct sigaction *new)}

{\tt \#include <linux/sched.h>}

Hardware interrupts are really a lot like signals.  Therefore, it
makes sense to be able to register an interrupt like a signal.  The
{\tt sa\_restorer()} field of the {\tt struct sigaction} is not used,
but otherwise it is the same.  The int argument to the {\tt
sa\_handler()} function may mean different things, depending on whether
or not the IRQ is installed with the {\tt SA\_INTERRUPT} flag.  If it
is not installed with the {\tt SA\_INTERRUPT} flag, then the argument
passed to the handler is a pointer to a register structure, and if it
is installed with the {\tt SA\_INTERRUPT} flag, then the argument
passed is the number of the IRQ.  For an example of a handler set to
use the {\tt SA\_INTERRUPT} flag, look at how {\tt rs\_interrupt()} is
installed in \dots/kernel/chr\_drv/serial.c.

The {\tt SA\_INTERRUPT} flag is used to determine whether or not the
interrupt should be a ``fast'' interrupt.  Normally, and for all
interrupts installed by {\tt request\_irq()}, upon return from the
interrupt, {\tt need\_resched}, a global flag, is checked.  If it is
set ($\not= 0$), then {\tt schedule()} is run, which may schedule
another process to run.  Normal interrupt handlers are also run with
all other interrupts still enabled.  However, by setting the {\tt
sigaction} structure member {\tt sa\_flags} to {\tt SA\_INTERRUPT},
a ``fast'' interrupt is selected instead, which leaves out this extra
processing.

{\tt irqaction()} takes two arguments:
\begin{dispitems}
\item [{\tt irq}] The number of the IRQ the driver wishes to acquire.
\item [{\tt new}] A pointer to a sigaction struct.
\item [{\bf Returns:}] {\tt -EBUSY} if the interrupt has already been
acquired,\\
{\tt -EINVAL} if {\tt sa\_handler()} is NULL,\\
0 on success.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt request\_irq(), free\_irq()}


\item[{\tt IS\_*(inode)}]
{\tt IS\_RDONLY(inode) ((inode)->i\_flags \& MS\_RDONLY)}\\
{\tt IS\_NOSUID(inode) ((inode)->i\_flags \& MS\_NOSUID)}\\
{\tt IS\_NODEV(inode) ((inode)->i\_flags \& MS\_NODEV)}\\
{\tt IS\_NOEXEC(inode) ((inode)->i\_flags \& MS\_NOEXEC)}\\
{\tt IS\_SYNC(inode) ((inode)->i\_flags \& MS\_SYNC)}

{\tt \#include <linux/fs.h>}

These five macros test whether the inode is on a filesystem mounted
with the corresponding flag.
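
A small user-space sketch of how these tests behave.  The {\tt MS\_*}
values below are assumptions chosen for the illustration (check {\tt
<linux/fs.h>} for the real ones), and {\tt struct sketch\_inode} and
{\tt sketch\_may\_write()} are hypothetical stand-ins; only the {\tt
i\_flags} field is modelled.

\begin{verbatim}
/* Assumed flag values, for illustration only. */
#define MS_RDONLY  1
#define MS_NOSUID  2
#define MS_NODEV   4
#define MS_NOEXEC  8
#define MS_SYNC   16

#define IS_RDONLY(inode) ((inode)->i_flags & MS_RDONLY)
#define IS_NOSUID(inode) ((inode)->i_flags & MS_NOSUID)
#define IS_NODEV(inode)  ((inode)->i_flags & MS_NODEV)
#define IS_NOEXEC(inode) ((inode)->i_flags & MS_NOEXEC)
#define IS_SYNC(inode)   ((inode)->i_flags & MS_SYNC)

/* Hypothetical stand-in for the kernel's inode structure. */
struct sketch_inode {
    unsigned int i_flags;
};

/* Hypothetical check a driver's write routine might make. */
static int sketch_may_write(const struct sketch_inode *inode)
{
    return !IS_RDONLY(inode);
}
\end{verbatim}

Note that each macro yields the masked flag bit rather than 1, so the
result should be tested for zero or nonzero, not compared with 1.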


\item [{\tt kfree*()}]
{\tt \#define kfree(x) kfree\_s((x), 0)}\\
{\tt void kfree\_s(void * obj, int size)}

{\tt \#include <linux/kernel.h>}

Free memory previously allocated with {\tt kmalloc()}.  There are two
possible arguments:
\begin{dispitems}
\item [{\tt obj}] Pointer to kernel memory to free.
\item [{\tt size}] To speed this up, if you know the size, use {\tt
kfree\_s()} and provide the correct size.  This way, the kernel memory
allocator knows which bucket cache the object belongs to, and doesn't
have to search all of the buckets.  (For more details on this
terminology, read \dots/lib/malloc.c.)
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt kmalloc()}.


\item [{\tt kmalloc()}]
{\tt void * kmalloc(unsigned int len, int priority)}

{\tt \#include <linux/kernel.h>}

Allocates a chunk of memory no larger than 4096 bytes.  Memory is
handed out in chunks whose sizes are powers of two.  {\tt kmalloc()}
takes two arguments:
\begin{dispitems}
\item [{\tt len}] Length of memory to allocate.  Must not exceed 4096,
or the kernel will complain ``kmalloc called with impossibly large
argument''.
\item [{\tt priority}] {\tt GFP\_KERNEL} or {\tt GFP\_ATOMIC}.  If {\tt
GFP\_KERNEL} is chosen, {\tt kmalloc()} may sleep, allowing
pre-emption to occur.  This is the normal
way of calling {\tt kmalloc()}.  However, there are cases where it is
better to return immediately if no pages are available, without
attempting to sleep to find one.  One place where this is true is
the swapping code, where sleeping could cause race conditions;
another is the networking code, where events can arrive much faster
than swapping to disk could free memory for the networking code.
{\bf [Another
reason for using {\tt GFP\_ATOMIC} is if it is being called from an
interrupt, when you cannot sleep.  I have not yet checked to see if
this is the case in either of these examples, but I suspect it might.]}

\item [{\bf Returns:}] {\tt NULL} on failure.\\
Pointer to allocated memory on success.
\end{dispitems}
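
The power-of-two rounding can be pictured with a small user-space
sketch.  It is a simplification: {\tt sketch\_bucket\_size()} and the
smallest chunk of 32 bytes are assumptions made for the illustration,
and any bookkeeping overhead that the real allocator adds to each
object is ignored.

\begin{verbatim}
#define SKETCH_MAX_ALLOC  4096u  /* limit stated above       */
#define SKETCH_MIN_BUCKET   32u  /* assumed smallest chunk   */

/* Returns the power-of-two chunk size that would hold len
 * bytes, or 0 if len exceeds the largest permitted request. */
static unsigned int sketch_bucket_size(unsigned int len)
{
    unsigned int size = SKETCH_MIN_BUCKET;

    if (len > SKETCH_MAX_ALLOC)
        return 0;
    while (size < len)
        size <<= 1;
    return size;
}
\end{verbatim}

Under these assumptions a 100-byte request would occupy a 128-byte
chunk, which is why it can pay to size frequently allocated
structures just under a power of two.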

{\bf Defined in:}\\
{\bf See also:} {\tt kfree()}


\item[{\tt ll\_rw\_block()}]
{\tt void ll\_rw\_block(int rw, int nr, struct buffer\_head *bh[])}

{\tt \#include <linux/fs.h>}

No device driver will ever call this code: it is called only through
the buffer cache.  However, an understanding of this function may help
you understand the function of the strategy routine.

After sanity checking, if there are no pending requests on the
device's request queue, {\tt ll\_rw\_block()} ``plugs'' the queue so
that the requests don't go out until all the requests are in the
queue, sorted by the elevator algorithm.  {\tt make\_request()} is
then called for each request.  If the queue had to be plugged, then
the strategy routine for that device is not active, and it is called,
{\bf with interrupts disabled.  It is the responsibility of the
strategy routine to re-enable interrupts.}

{\bf Defined in:} kernel/blk\_drv/ll\_rw\_block.c\\
{\bf See also:} {\tt make\_request()}, {\tt add\_request()}.


\item [{\tt MAJOR()}]
{\tt \#define MAJOR(a) (((unsigned)(a))>>8)}

{\tt \#include <linux/fs.h>}

This takes a 16 bit device number and gives the associated major
number by shifting off the minor number.

{\bf See also:} {\tt MINOR()}.


\item[{\tt make\_request()}]
{\tt static void make\_request(int major, int rw, struct buffer\_head *bh)}

This is a static function in ll\_rw\_block.c, and cannot be called by
other code.  However, an understanding of this function, as well as an
understanding of {\tt ll\_rw\_block()}, may help you understand the
strategy routine.

{\tt make\_request()} first checks to see if the request is readahead
or writeahead and the buffer is locked.  If so, it simply ignores the
request and returns.  Otherwise, it locks the buffer and, except for
SCSI devices, checks to make sure that write requests don't fill the
queue, as read requests should take precedence.

If no spaces are available in the queue, and the request is neither
readahead nor writeahead, {\tt make\_request()} sleeps on the event
{\tt wait\_for\_request}, and tries again when woken.  When a space in
the queue is found, the request information is filled in and {\tt
add\_request()} is called to actually add the request to the queue.

{\bf Defined in:} kernel/blk\_drv/ll\_rw\_block.c\\
{\bf See also:} {\tt add\_request()}, {\tt ll\_rw\_block()}.


\item [{\tt MINOR()}]
{\tt \#define MINOR(a) ((a)\&0xff)}

{\tt \#include <linux/fs.h>}

This takes a 16 bit device number and gives the associated minor
number by masking off the major number.
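
For example, a device number can be split with {\tt MINOR()} and its
companion {\tt MAJOR()}.  Both macros are repeated here so that the
fragment stands alone; {\tt sketch\_mkdev()} is a hypothetical helper
for the reverse direction, and the device number 0x0301 used in the
comments is just an illustrative value.

\begin{verbatim}
#define MAJOR(a) (((unsigned)(a))>>8)
#define MINOR(a) ((a)&0xff)

/* Hypothetical inverse: pack major and minor into 16 bits.
 * With major 3 and minor 1 this gives 0x0301, for which
 * MAJOR() yields 3 and MINOR() yields 1. */
static unsigned int sketch_mkdev(unsigned int major,
                                 unsigned int minor)
{
    return ((major & 0xff) << 8) | (minor & 0xff);
}
\end{verbatim}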

{\bf See also:} {\tt MAJOR()}.


\item [{\tt memcpy\_*fs()}]\begin{verbatim}
inline void memcpy_tofs(void * to, const void * from,
                        unsigned long n)
inline void memcpy_fromfs(void * to, const void * from,
                          unsigned long n)\end{verbatim}

{\tt \#include <asm/segment.h>}

Copies memory between user space and kernel space in chunks larger
than one byte, word, or long.  Be very careful to get the order of the
arguments right!

{\bf Note:} these functions may cause implicit I/O, if the memory
being accessed has been swapped out, and therefore pre-emption may
occur at this point.  Do not include these functions in critical
sections of your code unless the critical sections are protected by
{\tt cli()}/{\tt sti()} pairs.

These take three arguments:
\begin{dispitems}
\item [{\tt to}] Address to copy data to.
\item [{\tt from}] Address to copy data from.
\item [{\tt n}] Number of bytes to copy.
\end{dispitems}
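
Both functions take the destination first, in the same order as the C
library's {\tt memcpy()}.  A user-space stand-in makes the order
concrete; {\tt sketch\_copy()} is hypothetical and simply forwards to
{\tt memcpy()}, so it does not cross any segment boundary.

\begin{verbatim}
#include <string.h>

/* Hypothetical stand-in with the same argument order as
 * memcpy_tofs() and memcpy_fromfs(): destination, source,
 * byte count.  It copies within one address space only. */
static void sketch_copy(void *to, const void *from,
                        unsigned long n)
{
    memcpy(to, from, (size_t)n);
}
\end{verbatim}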

{\bf Defined in:}\\
{\bf See also:} {\tt get\_fs*()}, {\tt put\_fs*()}, {\tt cli()}, {\tt sti()}.


\item [{\tt outb(), outb\_p()}]
{\tt inline void outb(char value, unsigned short port)}\\
{\tt inline void outb\_p(char value, unsigned short port)}

{\tt \#include <asm/io.h>}

Writes a byte to a port.  {\tt outb()} goes as fast as it can, while
{\tt outb\_p()} pauses before returning.  Some devices are happier if
you don't write to them as fast as possible.  Both functions take two
arguments:
\begin{dispitems}
\item [{\tt value}] The byte to write.
\item [{\tt port}] Port to write byte to.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt inb()}, {\tt inb\_p()}.


\item[{\tt printk()}]
{\tt int printk(const char* fmt, ...)}

{\tt \#include <linux/kernel.h>}

{\tt printk()} is a version of {\tt printf()} for the kernel, with
some restrictions.  It cannot handle floats, and has a few other
limitations, which are documented in kernel/vsprintf.c.  It
takes a variable number of arguments:
\begin{dispitems}
\item [{\tt fmt}] Format string, {\tt printf()} style.
\item [{\tt ...}] The rest of the arguments, {\tt printf()} style.
\end{dispitems}
\begin{dispitems}
\item [{\bf Returns:}] Number of bytes written.

{\bf Note:} \blackdiamond {\tt printk()} may cause implicit I/O, if
the memory being accessed has been swapped out, and therefore
pre-emption may occur at this point.  Also, {\tt printk()} will set
the interrupt enable flag, so {\bf never use it in code protected by
{\tt cli()}.}  {\bf [But is it supposed to?]}

\end{dispitems}

{\bf Defined in:} kernel/vsprintf.c, kernel/syslog.c


\item [{\tt put\_fs*()}]
{\tt inline void put\_fs\_byte(char val, char *addr)}\\
{\tt inline void put\_fs\_word(short val, short *addr)}\\
{\tt inline void put\_fs\_long(unsigned long val, unsigned long *addr)}

{\tt \#include <asm/segment.h>}

Allows a driver to write data in user space, which is in a different
segment than the kernel.  When entering the kernel through a system
call, a selector for the current user space segment is put in the fs
segment register, thus the names.

{\bf Note:} these functions may cause implicit I/O, if the memory
being accessed has been swapped out, and therefore pre-emption may
occur at this point.  Do not include these functions in critical
sections of your code unless the critical sections are protected by
{\tt cli()}/{\tt sti()} pairs.

These functions take two arguments:
\begin{dispitems}
\item [{\tt val}] Value to write.
\item [{\tt addr}] Address to write data to.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt memcpy\_*fs()}, {\tt get\_fs*()}, {\tt cli()},
{\tt sti()}.


\item [{\tt register\_*dev()}]
\begin{verbatim}
int register_chrdev(unsigned int major, const char *name,
                    struct file_operations *fops)
int register_blkdev(unsigned int major, const char *name,
                    struct file_operations *fops)
\end{verbatim}

{\tt \#include <linux/fs.h>}\\
{\tt \#include <linux/errno.h>}

Registers a device with the kernel, letting the kernel check to make
sure that no other driver has already grabbed the same major number.
Takes three arguments:
\begin{dispitems}
\item [{\tt major}] Major number of device being registered.
\item [{\tt name}] Unique string identifying driver.  Not currently
used, but it should be in the future.
\item [{\tt fops}] Pointer to a {\tt file\_operations} structure for
that device.
\end{dispitems}
\begin{dispitems}
\item [{\bf Returns:}] {\tt -EINVAL} if major is $\ge$ {\tt MAX\_CHRDEV} or
{\tt MAX\_BLKDEV} (defined in {\tt <linux/fs.h>}), for character or
block devices, respectively.\\
{\tt -EBUSY} if major device number has already been allocated.\\
0 on success.
\end{dispitems}
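
The return-value rules can be sketched in user space as a table
indexed by major number.  Everything named {\tt sketch\_*} or {\tt
SKETCH\_*} below is hypothetical, including the table size, and the
{\tt fops} argument is reduced to an opaque pointer; the real
functions live in the kernel and are not reproduced here.

\begin{verbatim}
#include <errno.h>
#include <stddef.h>

/* Assumed stand-in for MAX_CHRDEV or MAX_BLKDEV. */
#define SKETCH_MAX_DEV 32

/* One slot per major number; NULL means the slot is free. */
static const void *sketch_table[SKETCH_MAX_DEV];

static int sketch_register_dev(unsigned int major,
                               const char *name,
                               const void *fops)
{
    (void)name;                 /* not used, as noted above */
    if (major >= SKETCH_MAX_DEV)
        return -EINVAL;         /* major out of range       */
    if (sketch_table[major] != NULL)
        return -EBUSY;          /* major already taken      */
    sketch_table[major] = fops;
    return 0;
}
\end{verbatim}

A driver's initialization code would typically check the result and
give up, with a message, if it is nonzero.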

{\bf Defined in:}\\
{\bf See also:}


\item [{\tt request\_irq()}]
{\tt int request\_irq(unsigned int irq, void (*handler)(int))}

{\tt \#include <linux/sched.h>}\\
{\tt \#include <linux/errno.h>}

Request an IRQ from the kernel, and install an IRQ interrupt handler
if successful.  Takes two arguments:
\begin{dispitems}
\item [{\tt irq}] The IRQ being requested.
\item [{\tt handler}] The handler to be called when the IRQ occurs.
The argument to the handler function will be the number of the IRQ
that it was invoked to handle.
\end{dispitems}
\begin{dispitems}
\item [{\bf Returns:}] {\tt -EINVAL} if {\tt irq} $>$ 15 or {\tt
handler} $=$ {\tt NULL}.\\
{\tt -EBUSY} if {\tt irq} is already allocated.\\
0 on success.
\end{dispitems}
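
The error rules above, together with {\tt free\_irq()}, can be
modelled in user space.  This is a toy model only: the {\tt
sketch\_*} and {\tt SKETCH\_*} names are invented for the
illustration, and no real interrupt is ever delivered.

\begin{verbatim}
#include <errno.h>
#include <stddef.h>

#define SKETCH_NR_IRQS 16   /* IRQs 0 to 15, as in the text */

/* One handler slot per IRQ; NULL means the IRQ is free. */
static void (*sketch_handlers[SKETCH_NR_IRQS])(int);

static int sketch_request_irq(unsigned int irq,
                              void (*handler)(int))
{
    if (irq > 15 || handler == NULL)
        return -EINVAL;
    if (sketch_handlers[irq] != NULL)
        return -EBUSY;
    sketch_handlers[irq] = handler;
    return 0;
}

static void sketch_free_irq(unsigned int irq)
{
    if (irq < SKETCH_NR_IRQS)
        sketch_handlers[irq] = NULL;
}

/* Example handler: records which IRQ it was invoked for. */
static int sketch_last_irq = -1;

static void sketch_handler(int irq)
{
    sketch_last_irq = irq;
}
\end{verbatim}

Freeing the IRQ makes the slot available again, so a driver that
releases its IRQ when the device is closed allows another driver to
claim it later.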

{\bf Defined in:}\\
{\bf See also:} {\tt free\_irq()}.


\item [{\tt select\_wait()}]
\begin{verbatim}
inline void select_wait(struct wait_queue **wait_address,
                        select_table *p)\end{verbatim}

{\tt \#include <linux/sched.h>}

Add a process to the proper {\tt select\_wait} queue.  This function
takes two arguments:
\begin{dispitems}
\item [{\tt wait\_address}] Address of a {\tt wait\_queue} pointer to
add to the circular list of waits.
\item [{\tt p}] If {\tt p} is {\tt NULL}, {\tt select\_wait} does
nothing; otherwise the current process is put to sleep.  This should
be the {\tt select\_table *wait} variable that was passed to your {\tt
select()} function.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt *sleep\_on(), wake\_up*()}


\item[{\tt *sleep\_on()}]
{\tt void sleep\_on(struct wait\_queue ** p)}\\
{\tt void interruptible\_sleep\_on(struct wait\_queue ** p)}

{\tt \#include <linux/sched.h>}

Sleep on an event, putting a {\tt wait\_queue} entry in the list so
that the process can be woken on that event.  {\tt sleep\_on()} goes
into an uninterruptible sleep: the only way the process can run is to
be woken by {\tt wake\_up()}.  {\tt interruptible\_sleep\_on()} goes
into an interruptible sleep, from which the process can also be woken
by signals and process timeouts.  Otherwise, a call to {\tt
wake\_up\_interruptible()} is necessary to wake up the process and
allow it to continue running where it left off.  Both take one
argument:
\begin{dispitems}
\item [{\tt p}] Pointer to a proper {\tt wait\_queue} structure
that records the information needed to wake the process.
\end{dispitems}

{\bf Defined in:}\\
{\bf See also:} {\tt select\_wait()}, {\tt wake\_up*()}.


\item [{\tt sti()}]
{\tt \#define sti() \_\_asm\_\_ \_\_volatile\_\_ ("sti"::)}

{\tt \#include <asm/system.h>}

Allows interrupts to be acknowledged.  {\tt sti} stands for
``SeT Interrupt enable''.

{\bf Defined in:}\\
{\bf See also:} {\tt cli()}.


\item[{\tt sys\_get*()}]
{\tt int sys\_getpid(void)}\\
{\tt int sys\_getuid(void)}\\
{\tt int sys\_getgid(void)}\\
{\tt int sys\_geteuid(void)}\\
{\tt int sys\_getegid(void)}\\
{\tt int sys\_getppid(void)}\\
{\tt int sys\_getpgrp(void)}

{\tt \#include <linux/sys.h>}

These system calls may be used to get the information described in the
table below, or the information can be extracted directly from the
process table, like this:\\
{\tt {\sl foo} = current->{\sl pid};}\\
\begin{tabular}{|r|l|}\hline
 {\tt pid} & Process ID\\\hline
 {\tt uid} & User ID\\\hline
 {\tt gid} & Group ID\\\hline
{\tt euid} & Effective user ID\\\hline
{\tt egid} & Effective group ID\\\hline
{\tt ppid} & Process ID of the process's parent\\\hline
{\tt pgid} & Process group ID of the process\\\hline
\end{tabular}

{\bf Defined in:}


\item [{\tt wake\_up*()}]
{\tt void wake\_up(struct wait\_queue ** p)}\\
{\tt void wake\_up\_interruptible(struct wait\_queue ** p)}

{\tt \#include <linux/sched.h>}

Wakes up a process that has been put to sleep by the matching {\tt
*sleep\_on()} function.  {\tt wake\_up()} can be used to wake up tasks
in a queue where the tasks may be in a {\tt TASK\_INTERRUPTIBLE} or
{\tt TASK\_UNINTERRUPTIBLE} state, while {\tt
wake\_up\_interruptible()} will only wake up tasks in a {\tt
TASK\_INTERRUPTIBLE} state, and will be insignificantly faster than
{\tt wake\_up()} on queues that have only interruptible tasks.  These
take one argument:
\begin{dispitems}
\item [{\tt p}] Pointer to the {\tt wait\_queue} structure of the
process to be woken.
\end{dispitems}
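
The difference between the two wake-up calls can be modelled in user
space.  This is a toy model, not the kernel's implementation: the
queue is a plain array of task states, and every {\tt sketch\_*} and
{\tt SKETCH\_*} name is invented for the illustration.

\begin{verbatim}
enum sketch_state {
    SKETCH_RUNNING,
    SKETCH_INTERRUPTIBLE,
    SKETCH_UNINTERRUPTIBLE
};

/* Wake every task in the queue, whatever its sleep state. */
static void sketch_wake_up(enum sketch_state *q, int n)
{
    int i;

    for (i = 0; i < n; i++)
        q[i] = SKETCH_RUNNING;
}

/* Wake only the tasks that are sleeping interruptibly. */
static void sketch_wake_up_interruptible(enum sketch_state *q,
                                         int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (q[i] == SKETCH_INTERRUPTIBLE)
            q[i] = SKETCH_RUNNING;
}
\end{verbatim}

In this model a task that went to sleep uninterruptibly is left
untouched by the second function, which is why the wake-up call has
to match the way the task was put to sleep.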

{\bf Defined in:}\\
{\bf See also:} {\tt select\_wait()}, {\tt *sleep\_on()}

\end{dispitems}