Documentation for the BOIL-language
-----------------------------------
(C) netEstate GmbH, www.netestate.de

Please call the interpreter 'boiler' without options to review all options.

BOIL (Brunnis Own Interpreter Language) is very much like C. We will only
list the differences:

1. Datatypes
---------------------------------------------------------------------------
Only the datatypes long, unsigned long, double, string and void are
available (in particular, there are no pointers or other derived datatypes).
Variables of type long, unsigned long and double are always initialised
with 0; strings start out undefined (like (char*) 0 in C).
Undefined strings may be used in assignments, as function-arguments and as
results, but not in expressions.
The library-function defined() returns 1 if a string is defined.

2. Conversions and casting
---------------------------------------------------------------------------
Automatic conversions between the datatypes are made in the following cases:

-between long and unsigned long
-from long and unsigned long to double

When you write expressions or call functions, pay attention to which types
are allowed, whether an implicit conversion applies, or whether you need an
explicit conversion (cast) with possible data loss.
You can cast between all types. Example:

printf((string) 2.1);

3. Missing Operators
---------------------------------------------------------------------------
-All assignments except '=' and '+='
 Attention: The result of += has type void for performance reasons
-','
-Conditional operator '?:'
-sizeof
-'+' with a single argument

Using a+=b instead of a=a+b with strings is recommended for performance
reasons.

4. Preprocessor-directives
---------------------------------------------------------------------------
The only directive is '#include file', where file must not be enclosed in
'<', '>' or '"'.
If you use #include, the interpreter must be called with the option
'-r number', where number is the maximum 'recursion depth' of the parser
when evaluating #include statements (default is 0 -> no #include possible).

5. Overall program structure
---------------------------------------------------------------------------
You do not have to define a function main(). Every block of statements is
a valid program. Example:

{
  printf("Hello World\n");
}

If you embed code in ASCII/HTML (option -h), it should look like this:

Content-Type: text/html

Test

or use <% and %>:

Content-Type: text/html

Test <% printf("Hello"); %> <% printf("World"); %>

This will presumably be called as a CGI script, so the first line must be

#!/path/to/boiler -hi

The option -i causes the interpreter 'boiler' to ignore this first line.

6. Referencing variables and functions
---------------------------------------------------------------------------
Variables and functions can be defined everywhere. They are valid in the
statement block of the defining statement.
You can reference variables or functions by name or with an expression of
the form

[ string-expression ]

where the result of string-expression is the name of the variable or
function. Example:

{
  string s="printf";
  [s]("Hello World\n");
  s="Hello World\n";
  string [s]=s;
  printf([s]);
  void [s](string a) { printf(a); return; };
  [s](s);
  s="exit";
  [s]();
}

A variable or function that you define is valid only in the current
statement-block, which is not always intended. If you reference something
with a string-expression, you can also add a second long-expression that
gives the number of statement-blocks to go up before defining or searching
for the variable/function. '0' stands for the current block, '1' for the
enclosing block, '2' for the block enclosing that one, etc.
A negative number stands for the 'start block'. The variables and functions
defined in the start block can be viewed as 'global'.
The number -2 stands for the 'system block', which can also be used by
isolated program parts. All library-functions and system-variables like
error and perror are defined in the system block at startup.
The call of

{
  string s="startblock\n";
  {
    string s="subblock 1\n";
    {
      string s="subblock 2\n";
      printf(s+["s",1]+["s",2]+["s",-1]);
    }
  }
}

will generate the output:

subblock 2
subblock 1
startblock
startblock

7. Accounting
---------------------------------------------------------------------------
There are four accounting types:

0 Internal recursions
1 Variables
2 Functions
3 Library function calls

The internal recursions are not recursions of the BOIL-program; they are
the many small subfunction-calls the interpreter has to make to execute a
statement.
Whenever the interpreter calls a subfunction, adds a variable, adds a
function or calls a library function, it increments the accounting counters
of the corresponding accounting type. When a subfunction terminates, a
variable or function is deleted, or a library function terminates, the
accounting counters are decremented.
Let's demonstrate this with a little program (examples/accounting):

{
  for (unsigned long i=0;i!=10;i++) {
    printf(".");
  }
  printf("\n");
}

If it is called with

boiler -z on -s -a 0:0:0:0 -a 1:0:0:0 -a 2:0:0:0 -a 3:0:0:0 accounting

we get the output:

..........
accounting-stats:
-----------------
type 0: sum: 169 (0 max) actual: 0 (0 max) time: 0.017135 (0 max)
type 1: sum: 1 (0 max) actual: 0 (0 max) time: 0.016029 (0 max)
type 2: sum: 0 (0 max) actual: 0 (0 max) time: 0.000000 (0 max)
type 3: sum: 11 (0 max) actual: 0 (0 max) time: 0.001578 (0 max)

This means:

-The interpreter made 169 subfunction-calls in 0.017 sec to execute the
 program
-One variable was defined
-Zero functions were defined
-Eleven calls of printf were made in 0.001 sec

It makes no sense to set time-limits for variables and functions, but it
does for library calls and internal recursions.
Limits on actual instances make no sense for library-function calls,
because the interpreter never executes more than one of them at a time.
The difference between limiting accumulated instances and actual instances
is simple: If you limit accumulated instances of internal recursions, you
limit the "runtime" of the program. If you limit actual instances, you
limit the recursion depth of the program.

8. New options for function blocks
---------------------------------------------------------------------------
In every function-definition you can put a string-expression before the
list of arguments, which contains command-line options of the interpreter:

void print("-z off",string s) {
  printf(s);
  return;
}

In this example the calls of print will have no time-accounting. The new
options are valid while the interpreter executes the function. The
following rules apply:

-If you use the option -a, the function gets a completely new accounting
 that is reset on every function call (exception: recursion)
-If you use the option -x, all previously disabled functions can be called
 again.

You cannot use the options -m, -d or -s in a function-definition (only at
the command line). Some options are available that do not exist on the
command line:

-p Specify a password for the external call of the function.
-l Deletes all handles for files, database-connections, etc. so that the
   function cannot use them.
-c type:handle
   Rescues a handle (after -l). Possible types:
   1  Streams
   2  Postgres-Connections
   3  Postgres-Queries
   4  RPC-Server
   5  RPC-Client
   6  Sockets
   7  Directories
   8  msql-connections
   9  msql-queries
   10 mysql-connections
   11 mysql-queries
-f Flags
   The flags regulate the permissions of the function when objects of the
   upper blocks are referenced (the system block is NOT an upper block).
   1 Function-calls are OK
   2 Variables can be referenced (read AND write)
   4 Variables can be declared
   If Flags is 0, the function is completely isolated from the upper blocks
   (except arguments and result, of course).
If -l was not used, the handles of the upper blocks can be used by an
isolated function.

Restrictions can be circumvented by defining a function with an
option-string. So you should use '-o off' (option strings off) if you
define a function with disabled functions, accounting-limits, etc.

9. Parsing code at runtime
---------------------------------------------------------------------------
In every function-definition you can use a statement of the form

eval(code)

instead of a statement block. code is a string-expression and must be a
correct BOIL-program. Parsing options can be adjusted with an option-string
in the function-definition. Example:

void print(string a) eval("{ printf(a); return; }")

10. Runtime errors in function calls
---------------------------------------------------------------------------
Runtime errors or calls of exit() in function calls normally lead to the
termination of the program. This is not always intended, so you can prevent
it by putting the name of the function in parentheses '(' ')' when calling
it. This is also the only way to get accounting-information for a function
with separate accounting.
A function-call of the form

(function)(args)

will set the following variables in the current statement-block:

unsigned long func_status       Status
                                1=OK
                                2=continue-statement
                                3=break-statement
                                4=return-statement
                                5=exit-statement or runtime-error
string func_error               Runtime-error
unsigned long func_error_line   Sourcecode line of runtime-error
string func_error_file          Sourcecode file of runtime-error

For every accounting-type there will be a variable:

string func_acc_                X1:X2:X3
                                X1=accumulated instances
                                X2=actual instances
                                X3=accumulated runtime

11. External functions
---------------------------------------------------------------------------
Distributed programming is realised with Remote Procedure Calls (RPC).
The call of a function can lead to the execution of a statement block in
the same program or - completely transparently - to an RPC on a remote
BOIL-server where the function is defined.

On the client side, you must define external functions. Instead of a
statement block write:

external(Client-Handle,Function name,Password)

e.g.:

void print external(clnt,"print","secret")

Client-handles are created and destroyed with the library-functions
clnt_create(host,rpc-program-number,version,protocol) and
clnt_destroy(handle).

On the server side, you must register one or more RPC-servers with the
function svc_register(rpc-program-number,version,protocol,fork,socket).
Requests are accepted with the function svc_run(timeout_sek,timeout_usek).
If fork==1, the server forks a child-process before executing a request,
so that it can accept other requests while the first one is processed.
Drawback: Modifications of the data-structures of the server program are
lost when the child exits.
ATTENTION: svc_run can handle more than one request at a time.
If a global function fork_handler(long pid) is defined, it is called with
the PID of the child after every creation of a child process.
You can only call functions of the server that have a password (specify
with -p in an option-string).
With svc_unregister(handle) you can unregister the RPC-servers.
The end of a child-process is normally ignored by the server. This can be
changed with sigcld(1), so that the end of every child must be collected
with the function long waitpid() (if not, you will have zombies).
The directory examples/rpc contains an example of server and clients.

12. Library-functions
---------------------------------------------------------------------------
If there is an error, EOF, etc. in a library-function, the system-variable
'error' will be set to 1 and 'perror' will contain the error-message.
A cast from string to long or double can also set error and perror.
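
The following untested sketch shows one way to check error after a cast.
It relies on assumptions that this document does not confirm: that the
string "12abc" is rejected by the cast, that error reflects the most recent
operation, and that C-style comments are accepted (comments are not listed
among the differences to C). The variable names are only illustrative.

```boil
{
  string input="12abc";           /* hypothetical input, assumed not to convert */
  long n=(long) input;            /* a cast from string to long can set error */
  if (error) {
    /* perror holds the error-message when error is set */
    printf("conversion failed: "+perror+"\n");
  } else {
    printf("n = "+(string) n+"\n");
  }
}
```

Checking error immediately after the cast keeps the test close to the
operation that may have set it.
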
You should check the value of error when calling library-functions that
cannot signal an error in the result. Example: The result of strlen() is 0
for an undefined string.
The following library-functions are implemented so far (R stands for
result):

12.1 Standard functions
---------------------------------------------------------------------------
unsigned long printf(string a)
    R 0      a is empty or Error
    R        Number of written chars.

void sleep(unsigned long time)
    Wait time seconds.

string getenv(string)
    Read an environment variable.
    R undef  Error
    R        Content of variable

unsigned long setenv(string name,string value,unsigned long overwrite)
    Set environment variable.
    R 0      Error
    R 1      OK

void srandom(unsigned long seed)
    See 'man srandom'. Default seed is the number of microseconds at the
    start of the interpreter modulo 65536.

long random()
    See 'man random'.

string crypt(string key)
    Crypt a password. The 'salt' argument that can be found in the
    libc-function is generated with random().
    R undef  Error
    R        Crypted password

long waitpid()
    Wait for the end of a child.
    R -1     Error
    R 0      No child was terminated
    R        PID of the terminated child

void sigcld(unsigned long flag)
    Change signal-handling for SIGCLD:
    flag==0  Ignore, default
    flag==1  Own handler. You must collect the end of the children with
             waitpid()

long getpid()
    Get own PID.
    R -1     Error
    R        PID

long system(string cmd)
    Call a shell-command.
    R -1     Error
    R        Exit-Code

double sqrt(double arg)
    Get square root.
    R        Result

12.2 Time
---------------------------------------------------------------------------
unsigned long time()
    R        Current time
    If a variable time_usec of type unsigned long exists, the microseconds
    part of the current time will be saved there.

unsigned long mktime(unsigned long year, unsigned long month,
                     unsigned long day, unsigned long hour,
                     unsigned long minute, unsigned long second)
    Convert time-representation.
    R 0      Error or 1 Jan 1970 00:00:00
    R        Number of seconds

string strftime(string format,unsigned long seconds)
    Generate a date-string. See 'man strftime' for metacharacters.
    R undef  Error
    R        Date-string

12.3 Files and directories
---------------------------------------------------------------------------
long filesize(string path)
    R -1     File not found or size > 2GB
    R        File size

unsigned long fileatime(string path)
    Get atime of a file.
    R 0      Error
    R        atime

unsigned long filemtime(string path)
    Get mtime of a file.
    R 0      Error
    R        mtime

unsigned long unlink(string path)
    Remove file.
    R 0      Error
    R 1      OK

string tmpnam()
    Generate a unique name for a temporary file.
    R undef  Error
    R        Filename

long opendir(string path)
    Open a directory.
    R -1     Error
    R        Handle

string readdir(long handle)
    Get next filename ("." and ".." will also be returned).
    R undef  Error
    R        Filename

unsigned long closedir(long handle)
    Close a directory.
    R 0      Error
    R 1      OK

unsigned long mkdir(string path)
    Make a directory.
    R 0      Error
    R 1      OK

unsigned long rmdir(string path)
    Remove a directory.
    R 0      Error
    R 1      OK

unsigned long rename(string oldpath,string newpath)
    Rename a file.
    R 0      Error
    R 1      OK

12.4 Functions for CGI scripts
---------------------------------------------------------------------------
unsigned long cgiwrap(string path,string postdata,unsigned long head)
    Special purpose function for netEstate. Will output a document with
    content-type or start a CGI script, if it has the extension .cgi:
    gif      image/gif
    jpg      image/jpeg
    jpeg     image/jpeg
    jpe      image/jpeg
    html     text/html
    htm      text/html
    cgi      Content-type will be specified by the called script
    *        application/octet-stream
    If head==1, the content-type will not be specified.
    If the file has no extension and is a directory, cgiwrap() will try to
    find index.html or index.htm in it.
    R 0      Error
    R 1      OK

unsigned long getpost()
    Reads POST-data for a CGI script into variables of the current block.
    R 0      Error
    R 1      OK

unsigned long postdata(string postdata)
unsigned long postdata_utf8(string postdata)
    Reads POST/GET-data for a CGI script into variables of the current
    block. The data is passed as argument and has to be read from the
    environment-variable QUERY_STRING or from stdin beforehand.
    R 0      Error
    R 1      OK

string postdata_utf8tolatin1(string postdata)
    Recodes POST/GET-data from QUERY_STRING or stdin from UTF8 to latin1.
    If the data contains invalid UTF8, the original string is returned and
    error is set.
    R undef  Error
    R        Result

string file_upload(long maxsize)
    Reads POST-data with encoding type 'multipart/form-data'. Forms with
    this encoding are commonly used for file-uploads. You can only transmit
    one file, which is saved under a temporary name (the result of the
    function). Other data is read into variables of the current block.
    Also, a wwwurl-encoded query-string is read into the variable
    upload_postdata of the current block. The original filename is stored
    as variable 'file'.
    R undef  Error
    R        Filename

string htmlenc(string data)
    HTML-metacharacters are escaped in the format &#%Number;
    ('<','>','"','\','&').
    R undef  Error
    R        New string

unsigned long phtmlenc(string data)
    Like htmlenc, but the result will be written with printf().
    R 0      Error
    R 1      OK

string fullhtmlenc(string data)
    Every non-alphanumeric char in data will be escaped in the format
    %Number, which is useful for generating POST- or GET-data in the
    default encoding 'application/x-www-form-urlencoded'.
    R undef  Error
    R        New string

12.5 Functions for internal things
---------------------------------------------------------------------------
void exit()
    Immediate termination.

unsigned long setproctitle(string)
    Sets command name in process list (ps).
    R 0      Error
    R 1      OK

unsigned long isset(string)
    Tests if a variable is defined.
    R 0      Variable not found
    R 1      Variable found

unsigned long isfunc(string)
    Tests if a function is defined.
    R 0      Function not found
    R 1      Function found

unsigned long deletefunc(string name,long block)
    Deletes a function. You cannot delete system-functions (like all
    library-functions).
    R 0      Error
    R 1      OK

unsigned long deletevar(string name,long block)
    Deletes a variable. You cannot delete system-variables like error and
    perror.
    R 0      Error
    R 1      OK

unsigned long parse(string code,string options)
    Parse the BOIL-program in code.
    R 0      Parsing error
    R 1      OK

12.6 Locking
---------------------------------------------------------------------------
long do_lock(string file)
    Make a lockfile named .lck.
    R -1     Error
    R 0      Lock present and PID running
    R 1      OK

long do_unlock(string file)
    Remove a lock.
    R -1     Error
    R 1      OK

12.7 Strings
---------------------------------------------------------------------------
string base64encode(long stream)
    R undef  Error
    R        Content of stream base64-coded

string base64stringencode(string a)
    R undef  Error
    R        Content of a base64-coded

long base64decode(string b64,long stream)
    Decodes the base64-string b64 and writes the result into stream.
    R 0      Error
    R 1      OK

string md5(string a)
    R undef  Error
    R        MD5-Hash of a

unsigned long defined(string a)
    R 0      String a undefined
    R 1      String a defined

string strstr(string a,string needle)
    Search needle in a and return it with the rest of a.
    R undef  Error (e.g. needle not found)
    R        Result

string strtok(string a,string delim)
    Tokenize a with the delimiters in delim. See 'man strtok'.
    Use token() for the second, third, etc. token.
    R undef  Error (e.g. token not found)
    R        First token

string token(string delim)
    Get more tokens.
    R undef  Error (e.g. token not found)
    R        Token

string strtok1(string a,string delim)
    Like strtok(), but uses the variable with the name a (reentrant
    version). Instead of token(delim) you always call strtok1(a,delim)
    here. The content of a is destroyed during tokenizing.

string strtok_esc(string a,string delim)
    Like strtok(), but chars in a that are escaped with '\' are not
    regarded as delimiters.
    R undef  Error
    R        First token

string token_esc(string tokens)
    R undef  Error
    R        Token

string strtolower(string a)
    Convert to lower case.
    R undef  Error
    R        Result

string strtoupper(string a)
    Convert to upper case.
    R undef  Error
    R        Result

string substr(string a,unsigned long start,unsigned long length)
    Get a substring of a. start is counted from 0, length from 1.
    R undef  Error
    R        Result

unsigned long strlen(string a)
    Length of a.
    R 0      Error or a empty.
    R        Length

string escape(string a,string escapes)
    Escape all chars in escapes with \ in string a.
    R undef  Error
    R        Result

string unescape(string a)
    Remove all escapes.
    R undef  Error
    R        Result

string escape1(string a)
    Escape CR,LF,HT and FF with "\n","\r","\v" and "\f".
    R undef  Error
    R        Result

string unescape1(string a)
    Unescape "\n","\r","\v" and "\f".
    R undef  Error
    R        Result

string textfile(string path)
    Read a textfile into a string.
    R undef  Error
    R        textfile as string

unsigned long getchar(string a,unsigned long pos)
    Get a char from a string. pos is counted from 0.
    R 0      Error
    R        Char

string chr(unsigned long ch)
    Get the character for 8-bit-codepoint ch.
    R undef  Error
    R        Char

string sql_escape(string data)
    Escape SQL-metacharacters (''','"','\') in data with '\'.
    R undef  Error
    R        Result

string str_repl(string source,string search,string replace)
    Substitute replace for search in source (once).
    R undef  Error
    R        Result

string str_repla(string source,string search,string replace)
    Substitute replace for search in source (all occurrences).
    R undef  Error
    R        Result

string latin1toutf8(string source)
    Convert source from Latin1 to UTF8.
    R undef  Error
    R        Result

string utf8tolatin1(string source,unsigned long mode)
    Convert source from UTF8 to Latin1.
    mode = 0 -> Generate error if character cannot be represented
    mode = 1 -> Generate "?" if character cannot be represented
    mode = 2 -> Generate HTML-Charref (e.g. &#945; for α) if character
                cannot be represented
    R undef  Error
    R        Result

12.8 Streams
---------------------------------------------------------------------------
long fopen(string filename,string mode)
    Open filename and return handle. "STDIN" and "STDOUT" can be used as
    special filenames.
    R -1     Error
    R        Handle

long popen(string command,string mode)
    Execute shell-command and return handle to read/write from/to its
    stdout/stdin.
    R -1     Error
    R        Handle

unsigned long exec(string path,string argname,
                   string inname,string outname)
    Execute 'path' with exec(). Command line arguments are specified in
    variables with the name argname_counter. argname is the basename,
    counter starts with 0 and stops when the corresponding string-variable
    is undefined. inname and outname are the names of the long-variables
    where the handles to read/write from/to stdout/stdin of the command
    are stored. Example:

    string arg_0="sendmail",arg_1="-bm",arg_2=bm,arg_3;
    long in,out;
    exec("/usr/sbin/sendmail","arg_","in","out");

    R 0      Error
    R 1      OK

long feof(long handle)
    R -1     Error
    R 0      No EOF in stream
    R 1      EOF

string fgets(long handle)
    R undef  Error or EOF
    R        Line from stream

unsigned long fputs(long handle,string data)
    R 0      Error
    R 1      OK

unsigned long fclose(long handle)
    R 0      Error
    R 1      OK

long pclose(long handle)
    R -1     Error
    R        Exit-Code

unsigned long fflush(long handle)
    R 0      Error
    R 1      OK

unsigned long setoutstream(long handle)
    The output-stream (printf,?> = SHIFTLEFT << SHIFTRIGHT >> PLUSPLUS ++ MINUSMINUS -- PLUSEQUAL +=

Terminal tokens are CONST (constant), NAME (reference) and PRINT (text
output when embedding code into ASCII/HTML).
Nonterminal tokens:

type:
      LONG
    | UNSIGNED LONG
    | DOUBLE
    | STRING
    | VOID

reference:
      NAME
    | '[' expr ']'
    | '[' expr ',' expr ']'

expr_list:
      expr
    | expr_list ',' expr

function_call:
      reference '(' expr_list ')'
    | reference '(' ')'
    | '(' reference ')' '(' expr_list ')'
    | '(' reference ')' '(' ')'

expr:
      CONST
    | reference
    | function_call
    | reference '=' expr
    | reference PLUSEQUAL expr
    | reference PLUSPLUS
    | reference MINUSMINUS
    | PLUSPLUS reference
    | MINUSMINUS reference
    | '(' type ')' expr
    | '(' expr ')'
    | '-' expr
    | '~' expr
    | '!' expr
    | expr '?' expr ':' expr
    | expr '*' expr
    | expr '/' expr
    | expr '%' expr
    | expr '+' expr
    | expr '-' expr
    | expr SHIFTLEFT expr
    | expr SHIFTRIGHT expr
    | expr '<' expr
    | expr '>' expr
    | expr LE expr
    | expr GE expr
    | expr EQUAL expr
    | expr NOTEQUAL expr
    | expr '&' expr
    | expr '^' expr
    | expr '|' expr
    | expr AND expr
    | expr OR expr

arg_list_entry:
      type NAME

arg_list:
      arg_list_entry
    | arg_list ',' arg_list_entry

function_dekl:
      type reference external_stmt
    | type reference '(' arg_list ')' block
    | type reference '(' ')' block
    | type reference '(' arg_list ')' eval_stmt
    | type reference '(' ')' eval_stmt
    | type reference '(' expr ',' arg_list ')' block
    | type reference '(' expr ')' block
    | type reference '(' expr ',' arg_list ')' eval_stmt
    | type reference '(' expr ')' eval_stmt

dekl_list_entry:
      reference
    | reference '=' expr

dekl_list:
      dekl_list_entry
    | dekl_list ',' dekl_list_entry

variable_dekl:
      type dekl_list ';'

expr_stmt:
      ';'
    | expr ';'

choice_stmt:
      IF '(' expr ')' stmt
    | IF '(' expr ')' stmt ELSE stmt

loop_stmt:
      WHILE '(' expr ')' stmt
    | DO stmt WHILE '(' expr ')' ';'
    | FOR '(' stmt expr ';' expr ')' stmt
    | FOR '(' stmt ';' expr ')' stmt
    | FOR '(' stmt expr ';' ')' stmt
    | FOR '(' stmt ';' ')' stmt

jump_stmt:
      BREAK ';'
    | CONTINUE ';'
    | RETURN ';'
    | RETURN expr ';'

eval_stmt:
    | EVAL '(' expr ')'

external_stmt:
      EXTERNAL '(' expr ',' expr ',' expr ')'

stmt:
      variable_dekl
    | function_dekl
    | expr_stmt
    | block
    | choice_stmt
    | loop_stmt
    | jump_stmt
    | PRINT

stmt_list:
      stmt
    | stmt_list stmt

block:
      '{' '}'
    | '{' stmt_list '}'
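
As an illustration of how some of these productions combine, here is a
small untested sketch. The function name show and its arguments are
hypothetical, and C-style comments are assumed to be accepted, since
comments are not listed among the differences to C. It uses the
function_dekl form with an option-string and the FOR form whose first
element is a full stmt.

```boil
{
  /* function_dekl: type reference '(' expr ',' arg_list ')' block
     The leading expr is the option-string ("-z off": no time-accounting). */
  void show("-z off",string label,long n) {
    printf(label+(string) n+"\n");
    return;
  }

  /* loop_stmt: FOR '(' stmt expr ';' expr ')' stmt
     The initialiser is a stmt (here a variable_dekl), so it brings its
     own ';'. */
  for (long i=0;i!=3;i++) {
    show("i = ",i);
  }
}
```

Because the FOR initialiser is a stmt rather than an expr, a variable can
be declared directly in the loop header, as the accounting example in
section 7 also does.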