make-sql-cmds is a shell script that creates a series of tables in a relational database. The tables contain data describing the code for version 2.7b of the Mosaic World-Wide Web browser. The script is not specific to Mosaic; it may be used with any code for which the proper data files exist.
The script accepts one of the four arguments "globals", "symbols", "calls", or "procs". Each argument produces one table: the globals table contains information about global variable definitions; the symbols table contains information about every symbol appearing in the code; the calls table contains information about each procedure call appearing in the code; and the procs table contains information about the procedure definitions appearing in the code.
make-sql-cmds does not use the source-code files to generate the tables; instead, it uses the data files created for Sun's source-code browsing tool. The browsing data files are generated by giving the -xsb option to Sun's C compiler; this creates a binary-encoded file containing source-code information. Because the binary format is hard to deal with, each binary file is run through the sbdump command, which produces an ASCII text version of the binary file; it is these text files that make-sql-cmds uses to generate the tables.
make-sql-cmds produces as output (to std-out) a series of SQL commands; these commands, when fed to a relational database, create the source-code tables. make-sql-cmds does no source-code analysis itself; it merely produces the tables used for source-code analysis.
make-sql-cmds can generate four tables; the choice of table is specified by a command-line argument captured in the shell variable what. The actual table generation is delegated to a shell function containing an AWK script.
<generate the table>= (U->)
case $what in
  globals) make_global_defs ;;
  calls)   make_call_info ;;
  symbols) make_symbol_refs ;;
  procs)   make_proc_refs ;;
  *)       echo \"$what\" is an unknown choice.  Choices are \"globals\", \
                \"symbols\", \"calls\", or \"procs\". 1>&2
           exit 1
           ;;
esac
The make_call_info() shell function generates a table of who-calls-whom information.
<make_call_info() shell function>= (U->)
function make_call_info {
gawk '
<create the call-info table>
<get the file and directory names>
<get the calling information>
'
}
make_call_info() first outputs the SQL command to create the table that will hold the procedure call information. The table is called call_info and has the following fields:
file - The name of the file containing the call.
dir - The name of the directory holding the file.
caller - The name of the calling procedure.
called - The name of the called procedure.
lno - The line number within file at which the call occurred.
<create the call-info table>= (<-U U->)
BEGIN {
  table_name = "call_info"
  printf "create table %s (", table_name
  printf "dir varchar(80) not null,"
  printf "file varchar(40) not null,"
  printf "caller varchar(40) not null,"
  printf "called varchar(40) not null,"
  printf "lno int unsigned not null"
  printf ");\n"
}
The names of the file and its containing directory are given in the Source name section of the sbdump file. The line after the section header holds the full path name of the file. This line gets split into pieces delimited by the directory separator /, and the pieces are reassembled into file, the file name, and dir, the full path leading to file.
<get the file and directory names>= (<-U U-> U-> U->)
/.... Source name section/ {
  getline
  i = split($2, names, "/")
  file = names[i]
  dir = ""
  for (j = 2; j < i; j++)
    dir = dir "/" names[j]
}
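To see what the split produces, here is a minimal stand-alone sketch of the same logic. The function name split_path is hypothetical, it calls plain awk rather than gawk, and it takes the path as an argument instead of reading the second field of an sbdump line; the sample path is made up.

```shell
# split_path is a hypothetical helper, not part of make-sql-cmds.
# It prints "dir file" for the absolute path given as $1.
split_path() {
  awk -v path="$1" 'BEGIN {
    i = split(path, names, "/")   # names[1] is empty: the path starts with "/"
    file = names[i]               # the last piece is the file name
    dir = ""
    for (j = 2; j < i; j++)       # rejoin the pieces between the two
      dir = dir "/" names[j]
    print dir, file
  }'
}

split_path /usr/src/mosaic/gui.c
```

The loop starts at 2 because a leading / leaves an empty first piece; starting at 1 would put a doubled slash at the front of dir.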
The calling information in an sbdump file is contained in lines that begin with a number and the words Call From. The fourth field is the caller's name, the sixth field is the called procedure's name, and the eighth field is the line number at which the call occurred. If a procedure is called from within a macro, the associated line number is negative; in that case it is negated to make it positive.
Strings in SQL are delimited by single quotes (they may also be delimited by double quotes). Including a literal single quote in the AWK code would end the shell's single-quoted string prematurely; using \047, the octal value of a single quote, avoids the problem.
<get the calling information>= (<-U)
/[0-9]*: Call From:/ {
  if ($8 < 0)
    $8 = -$8
  printf "insert into %s values (", table_name
  printf "\047%s\047, ", dir
  printf "\047%s\047, ", file
  printf "\047%s\047, ", $4
  printf "\047%s\047, ", $6
  printf "%d", $8
  printf ");\n"
}
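The quoting trick and the sign flip can be tried in isolation. The sketch below is only an illustration under stated assumptions: sql_insert_call is a hypothetical helper, it uses plain awk, and it takes its values as arguments rather than parsing them from an sbdump line.

```shell
# sql_insert_call is a hypothetical helper, not part of make-sql-cmds.
# Arguments: $1 = dir, $2 = file, $3 = caller, $4 = called, $5 = line number
sql_insert_call() {
  awk -v dir="$1" -v file="$2" -v caller="$3" -v called="$4" -v lno="$5" 'BEGIN {
    if (lno < 0)
      lno = -lno                      # calls made inside macros are negative
    printf "insert into call_info values ("
    printf "\047%s\047, ", dir        # \047 is the octal code for a single quote
    printf "\047%s\047, ", file
    printf "\047%s\047, ", caller
    printf "\047%s\047, ", called
    printf "%d);\n", lno
  }'
}

sql_insert_call /src gui.c main init -42
```

Because the AWK program sits inside a shell single-quoted string, every SQL quote has to be spelled \047; the shell never sees a quote character that could close the string early.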
The make_global_defs() shell function generates a table of information on global variable definitions.
<make_global_defs() shell function>= (U->)
function make_global_defs {
gawk '
<create the global-defs table>
<get the file and directory names>
<get the global definition information>
'
}
make_global_defs() first outputs the SQL command to create the table that will hold the global variable definition information. The table is called global_defs and has the following fields:
file - The name of the file containing the definition.
dir - The name of the directory holding the file.
name - The name of the global variable.
lno - The line number within file at which the global is defined.
<create the global-defs table>=
BEGIN {
  table_name = "global_defs"
  printf "create table %s (", table_name
  printf "dir varchar(80) not null,"
  printf "file varchar(40) not null,"
  printf "name varchar(40) not null,"
  printf "lno int unsigned not null"
  printf ");\n"
}
The global-definition information in an sbdump file is contained in lines that begin with a number and the words Symbol on.
<get the global definition information>= (<-U)
/[0-9]*: Symbol on/ {
  i = split($7, parts, "_")
  if ((i >= 5) && (parts[3] == "def") && (parts[4] == "var") && (parts[5] == "global")) {
    printf "insert into %s values (", table_name
    printf "\047%s\047, ", dir
    printf "\047%s\047, ", file
    printf "\047%s\047, ", substr($6, 2, length($6) - 2)
    printf "%d", substr($5, 1, length($5) - 1)
    printf ");\n"
  }
}
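The test that selects global definitions can be sketched on its own. Here is_global_def is a hypothetical helper using plain awk, and the tag strings passed to it are made up to match the shape the test expects; they are not taken from real sbdump output.

```shell
# is_global_def is a hypothetical helper, not part of make-sql-cmds.
# It prints "yes" if the underscore-separated tag in $1 has "def", "var"
# and "global" as its third, fourth and fifth pieces, and "no" otherwise.
is_global_def() {
  awk -v tag="$1" 'BEGIN {
    i = split(tag, parts, "_")
    if ((i >= 5) && (parts[3] == "def") && (parts[4] == "var") && (parts[5] == "global"))
      print "yes"
    else
      print "no"
  }'
}

is_global_def cb_x_def_var_global   # invented tag that passes the test
is_global_def cb_x_ref_var_global   # invented tag that fails it
```

Checking the piece count first keeps the comparison from silently reading array elements that were never set when the tag is short.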
The make_symbol_refs() shell function generates a table of information on symbol references.
<make_symbol_refs() shell function>= (U->)
function make_symbol_refs {
gawk '
<create the symbol-reference table>
<get the file and directory names>
<get the symbol reference information>
<report error lines>
'
}
First output the SQL command to create the table that will hold the symbol reference information. The table is called symbols and has the following fields:
file - The name of the file containing the reference.
dir - The name of the directory holding the file.
name - The name of the referenced symbol.
type - The reference type.
lno - The line number within file at which the symbol reference occurred.
<create the symbol-reference table>= (<-U)
BEGIN {
  relname = "symbols"
  printf "create table %s (", relname
  printf "dir varchar(80) not null, "
  printf "file varchar(40) not null, "
  printf "lno int unsigned not null, "
  printf "name varchar(40) not null, "
  printf "type char(40) not null"
  printf ");\n"
}
The symbol-reference information in an sbdump file is contained in lines that begin with a number and the words Symbol on.
<get the symbol reference information>= (<-U)
/[0-9]*: Symbol on/ {
  type = $(NF - 2)
  if (!match(type, "^cb_")) {
    loc = dir "/" file
    if (length(loc) > 55)
      loc = "[...]" substr(loc, length(loc) - 50)
    printf "Bad symbol line from %s:\n \"%s\"\n", loc, $0 > "/dev/stderr"
  }
  else {
    symbol = $0
    sub("^[^\047]*\047", "", symbol)
    sub("\047[^\047]*$", "", symbol)
    if (match(symbol, "\047"))
      bad_lines++
    else {
      printf "insert into %s values (", relname
      printf "\047%s\047, ", dir
      printf "\047%s\047, ", file
      printf "%d, ", substr($5, 1, length($5) - 1)
      printf "\047%s\047, ", symbol
      gsub("_", " ", type)
      printf "\047%s\047", type
      printf ");\n"
    }
  }
}
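The two sub() calls that isolate the symbol name are easier to follow on a small input. In this sketch extract_symbol is a hypothetical helper using plain awk, and the input lines are invented; only the quote handling matches the chunk above.

```shell
# extract_symbol is a hypothetical helper, not part of make-sql-cmds.
# It reads lines on stdin and prints the text between the first and last
# single quote of each line, or DROPPED if that text itself holds a quote.
extract_symbol() {
  awk '{
    symbol = $0
    sub("^[^\047]*\047", "", symbol)   # delete up to and including the first quote
    sub("\047[^\047]*$", "", symbol)   # delete from the last quote to end of line
    if (match(symbol, "\047"))
      print "DROPPED"
    else
      print symbol
  }'
}

# The input lines are invented; printf turns \047 into a single quote.
printf 'before \047main\047 after\n' | extract_symbol
printf 'before \047a\047b\047 after\n' | extract_symbol
```

A symbol that still contains a quote after trimming cannot be placed inside a single-quoted SQL string as it stands, which is why such lines are counted and dropped rather than inserted.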
If any symbol-reference lines were dropped because of quoting problems, print a message to std-err indicating so.
<report error lines>= (<-U)
END {
  if (bad_lines > 0)
    printf "Dropped %d lines with symbols containing single quotes.\n", bad_lines > "/dev/stderr"
}
The make_proc_refs() shell function generates a table of information on procedure references.
<make_proc_refs() shell function>= (U->)
function make_proc_refs {
gawk '
<create the procedure-reference table>
<get the file and directory names>
<get the procedure reference information>
'
}
First output the SQL command to create the table that will hold the procedure reference information. The table is called proc_refs and has the following fields:
file - The name of the file referencing the procedure.
dir - The name of the directory holding file.
proc - The name of the procedure being referenced.
start, end - The first and last lines of proc's definition in file; if proc is being called but not defined, start and end are zero.
<create the procedure-reference table>= (<-U)
BEGIN {
  relname = "proc_refs"
  printf "create table %s (", relname
  printf "dir varchar(80) not null, "
  printf "file varchar(40) not null, "
  printf "proc varchar(40) not null, "
  printf "start int unsigned not null, "
  printf "end int unsigned not null"
  printf ");\n"
}
The procedure-reference information in an sbdump file is contained in lines that begin with a number and the words Function name.
<get the procedure reference information>= (<-U)
/[0-9]*: Function name:/ {
  printf "insert into %s values (", relname
  printf "\047%s\047, ", dir
  printf "\047%s\047, ", file
  printf "\047%s\047, ", substr($4, 1, length($4) - 1)
  if (split($7, lnos, "[{},]") != 4) {
    print "Line-number parsing error." > "/dev/stderr"
    exit
  }
  printf "%s, %s);\n", lnos[2], lnos[3]
}
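The line-number split relies on the separators producing exactly four pieces. The sketch below shows this in isolation; parse_range is a hypothetical helper using plain awk, and the {start,end} argument shape is inferred from the split pattern rather than taken from sbdump documentation. Unlike the chunk above, the sketch exits with a nonzero status on error.

```shell
# parse_range is a hypothetical helper, not part of make-sql-cmds.
# It prints "start end" for an argument shaped like {start,end}.
parse_range() {
  awk -v range="$1" 'BEGIN {
    # "{12,34}" splits into four pieces: "", "12", "34", ""
    if (split(range, lnos, "[{},]") != 4) {
      print "Line-number parsing error." > "/dev/stderr"
      exit 1
    }
    print lnos[2], lnos[3]
  }'
}

parse_range '{12,34}'
```

The empty first and last pieces come from the braces at either end, so the interesting values sit at indexes 2 and 3.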
<make-sql-cmds>=
#!/bin/ksh

<shell boilerplate>

<make_call_info() shell function>
<make_global_defs() shell function>
<make_symbol_refs() shell function>
<make_proc_refs() shell function>

<process the command-line options>

<generate the file data> | <generate the table> | <clean the data>
There must be one command-line argument giving the type of table to be generated; any other arguments are interpreted as the names of sbdump files.
<process the command-line options>= (<-U)
[ $# -lt 1 ] && badcmd

files=''
what=''
while [ $# -gt 0 ]
do
  case $1 in
    globals|symbols|procs|calls)
      [ "$what" ] && badcmd
      what=$1
      ;;
    *)
      files="$files $1"
      ;;
  esac
  shift 1
done

[ ! "$what" ] && badcmd
If sbdump files were given on the command line, extract data from only those files; otherwise go through the whole source directory tree and extract data from every sbdump file found.
<generate the file data>= (<-U)
cd /net/projects/groups/morale/src/mosaic-2.7b

if [ "$files" ]
then
  for f in $files ; do cat $f ; done
else
  for f in `find . -name '*bdd' -print` ; do cat $f ; done
fi
The full paths for files are long, which leads to larger databases and cramped, badly formatted output. Stripping the maximal common prefix from the full path names makes things better.
<clean the data>= (<-U)
sed "s;$PWD/;;g"
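The effect of the sed command can be checked with a fixed prefix in place of $PWD. In this sketch strip_prefix is a hypothetical helper and the path is made up.

```shell
# strip_prefix is a hypothetical helper, not part of make-sql-cmds.
# It deletes every occurrence of "$1/" from the text on stdin.
strip_prefix() {
  sed "s;$1/;;g"
}

echo /proj/src/libwww/HTTP.c | strip_prefix /proj/src
```

Using ; as the sed delimiter avoids having to escape the slashes in the path. The prefix is still interpreted as a regular expression, so a directory name containing characters such as . matches a little more loosely than a literal comparison would.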
<shell boilerplate>= (<-U)
pgmname=`basename $0`

function oops {

  # Print $1 to std-err and die.

  echo 1>&2 "$1."
  exit 1
  }

function badcmd {

  # Print a bad command message and die.

  echo 1>&2 "Command format is"
  oops " \"$pgmname [fname]... globals | symbols | procs | calls\""
  }