Parsing script files

From Monster Wiki

File names

By convention, script file names should be lowercase, contain no spaces, and end with the extension '.mn'. None of these rules are absolute, but breaking them might make it impossible for the VM to find class scripts automatically. (You can easily work around this, however, by loading all the classes explicitly before you use them.) If you are not loading scripts from the file system (e.g. if you load them directly from an archive file stream), these rules do not apply.

For class scripts (see below), it's a good idea to give the script file the same name as the class, but in lowercase, so the class MyClass would reside in myclass.mn. For portability, avoid putting files whose names differ only in case (like MyScript.mn and myscript.mn) in the same directory. Case-sensitive file systems (typical on Unix) allow this, while case-insensitive ones (typical on Windows) do not.
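The naming convention can be sketched from the host side in Python. This is only an illustration, not how the VM looks up scripts: the helper names script_file_for_class and case_collisions are hypothetical, and the lookup rule is assumed to be "lowercase the class name and append '.mn'".

```python
from collections import defaultdict

SCRIPT_EXTENSION = ".mn"


def script_file_for_class(class_name):
    """Return the conventional script file name for a class name."""
    return class_name.lower() + SCRIPT_EXTENSION


def case_collisions(file_names):
    """Group file names that differ only in letter case."""
    groups = defaultdict(set)
    for name in file_names:
        groups[name.lower()].add(name)
    # Keep only the groups that contain more than one spelling.
    return [sorted(names) for names in groups.values() if len(names) > 1]


print(script_file_for_class("MyClass"))
print(case_collisions(["MyScript.mn", "myscript.mn", "other.mn"]))
```

A check like case_collisions could be run over a script directory before moving it between platforms.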

Unicode

All script files are expected to be encoded in Unicode (currently only UTF-8). The file may begin with an optional byte-order mark. Plain ASCII text works as well, since ASCII is a subset of UTF-8.

TODO: The VM does not currently load files encoded in UTF-16 or UTF-32. This will be implemented upon request.
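As a rough sketch of the decoding rules above, here is how a host tool written in Python might read script bytes. The function name decode_script is hypothetical, and rejecting UTF-16/UTF-32 input by its byte-order mark is an assumption about how "not supported" could be reported; the VM itself may behave differently.

```python
import codecs

# Byte-order marks of the encodings that are not supported yet.
UNSUPPORTED_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def decode_script(data):
    """Decode raw script bytes as UTF-8, dropping an optional byte-order mark."""
    if data.startswith(UNSUPPORTED_BOMS):
        raise ValueError("UTF-16 and UTF-32 scripts are not supported")
    # 'utf-8-sig' accepts input with or without a leading UTF-8 BOM.
    return data.decode("utf-8-sig")


print(decode_script(b"\xef\xbb\xbfint i;"))
print(decode_script(b"int i;"))
```

Both calls yield the same text, since the byte-order mark is dropped and pure ASCII input is already valid UTF-8.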

Unix scripts

If the first line begins with a 'shebang' - '#!' - that line is ignored. This can be used on Unix systems to run scripts directly from the command line:

#!/usr/bin/mvm
writeln("Hello world!");

The name and location of the 'mvm' program are not standardized yet, though.
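The shebang rule is easy to mimic in a preprocessing step. The Python sketch below uses a hypothetical helper, strip_shebang, and keeps the line's newline so that later line numbers stay unchanged; that last detail is a design assumption, not something this page specifies.

```python
def strip_shebang(source):
    """Drop a leading '#!' line but keep its newline, so line numbers stay stable."""
    if not source.startswith("#!"):
        return source
    newline = source.find("\n")
    if newline == -1:
        # The whole file is just the shebang line.
        return ""
    return source[newline:]


script = '#!/usr/bin/mvm\nwriteln("Hello world!");\n'
print(repr(strip_shebang(script)))
```

Only the very first line is examined; a '#!' anywhere else in the file is left alone.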

Script types

Functions

The engine supports two types of script files. The first is a plain list of statements to execute. These run much like functions, and are often called function scripts.

// Example of function script. This is the entire contents of the file.
io.writeln("This is a script");

Function scripts may optionally have a function definition as the first statement in the file:

// Example of function script that takes parameters
function int doSomething(int i, int j);
i += j;
io.writeln("Returning", i);
return i;

Classes

The other type of script is the class script. It must begin with a valid class, module or singleton declaration.

class MyClass;
 
int i;
func() {}

See function declarations and class declarations for more information.
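To make the distinction between the two script types concrete, here is a small Python sketch that guesses the type from the first word of a file. It rests on several assumptions that this page does not confirm: comments and any shebang line have already been removed, no modifier precedes the class, module or singleton keyword, and identifiers look like C identifiers. The name classify_script is hypothetical.

```python
import re

# Keywords that, per the text above, open a class script.
CLASS_SCRIPT_KEYWORDS = {"class", "module", "singleton"}

# First word of the file, after optional leading whitespace (assumed syntax).
FIRST_WORD_RE = re.compile(r"\s*([A-Za-z_]\w*)")


def classify_script(source):
    """Return 'class' or 'function' depending on how the script begins."""
    match = FIRST_WORD_RE.match(source)
    if match and match.group(1) in CLASS_SCRIPT_KEYWORDS:
        return "class"
    return "function"


print(classify_script("class MyClass;\n\nint i;\nfunc() {}"))
print(classify_script('io.writeln("This is a script");'))
```

A file that opens with a function definition, as in the parameter example above, still counts as a function script under this rule.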

Tokens and parsing

Monster-script is in the C-family of languages (together with C++, D, Java, C# and many others). Like most of its brethren it is parsed as a series of tokens. These tokens include syntax characters like { and }, operators like +, && and *=, and identifiers like myFunctionName. A complete list of tokens is given below.

Whitespace (newlines, spaces, tabs, etc.) is optional and mostly ignored. Thus the following two snippets of code are equivalent:

int sum(int[] list)
{
    int res = 0;
    foreach( v; list )
        res += v;
    return res;
}

int sum(int[]list){int res=0
;foreach(v;list
)res+=v;return   res;}

The only places where whitespace is not ignored are

  • when separating identifier tokens ('int res' is two identifiers while 'intres' is one)
  • inside string and character literals
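The whitespace rules above can be checked with a toy tokenizer. The Python sketch below is not the real Monster lexer: it knows only a handful of symbols, assumes C-style identifiers and plain integer literals, and does not handle string or character literals. It shows that the two layouts of sum produce the same token stream, and that whitespace still matters between two words.

```python
import re

# Small subset of the symbol tokens, longest first so '+=' wins over '+'.
SYMBOLS = ["+=", "+", "=", "(", ")", "{", "}", "[", "]", ";"]

TOKEN_RE = re.compile(
    r"\s+"                        # whitespace separates tokens and is dropped
    r"|(?P<word>[A-Za-z_]\w*)"    # identifiers and keywords (assumed syntax)
    r"|(?P<number>\d+)"           # plain integer literals only
    r"|(?P<symbol>" + "|".join(re.escape(s) for s in SYMBOLS) + ")"
)


def tokenize(source):
    """Split source text into a list of token strings."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup is not None:  # None means pure whitespace
            tokens.append(match.group())
        pos = match.end()
    return tokens


SPACED = """int sum(int[] list)
{
    int res = 0;
    foreach( v; list )
        res += v;
    return res;
}"""

DENSE = """int sum(int[]list){int res=0
;foreach(v;list
)res+=v;return   res;}"""

print(tokenize(SPACED) == tokenize(DENSE))
print(tokenize("int res"), tokenize("intres"))
```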

Case sensitivity

Monster is case sensitive with regard to all identifiers and keywords:

int myInt;
myint = 3;      // error, not defined
 
int abc;
int ABC;        // ok, abc and ABC are different names
 
int class;      // error, 'class' is a keyword
int Class;      // ok, 'Class' is not a keyword

Comments

A comment is a part of the source code that is completely ignored by the parser. Monster-script has three types of comments:

  • Line comments start with '//' and terminate at the end of the line
  • Block comments start with '/*', terminate with '*/', and can span multiple lines
  • Nested block comments start with '/+' and terminate with '+/', but may contain any number of matching pairs of '/+' and '+/' in between.

Example:

// line comment: ignore the rest of this line

/* block comment: ignore a block of text that
   can span multiple lines */

/+ nested block comments: ignore blocks of text, including other
   /+ nested comments +/
   /+ You /+ can /+ nest /+ these /+ as +/ deep +/ as +/ you +/ like +/
+/

Nested comments are particularly useful for commenting out large pieces of unused code, which might contain comment blocks themselves.
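The difference between the comment types comes down to depth counting, which the following Python sketch illustrates. It is a simplified model rather than the actual parser: string and character literals are not recognised (so comment markers inside a string would be treated as comments), and removed comments are not replaced by whitespace. The name strip_comments is hypothetical.

```python
def strip_comments(source):
    """Remove //, /* */ and nested /+ +/ comments from source text."""
    out = []
    i = 0
    n = len(source)
    while i < n:
        two = source[i:i + 2]
        if two == "//":
            # Line comment: skip up to, but not including, the newline.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif two == "/*":
            # Block comment: ends at the first '*/', no nesting.
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated /* comment")
            i = end + 2
        elif two == "/+":
            # Nested comment: count depth until every '/+' is closed.
            depth = 1
            i += 2
            while depth > 0:
                if i >= n:
                    raise ValueError("unterminated /+ comment")
                two = source[i:i + 2]
                if two == "/+":
                    depth += 1
                    i += 2
                elif two == "+/":
                    depth -= 1
                    i += 2
                else:
                    i += 1
        else:
            out.append(source[i])
            i += 1
    return "".join(out)


print(repr(strip_comments("a /+ outer /+ inner +/ still outer +/ b")))
print(repr(strip_comments("a /* /* */ b")))
```

In the second call the inner '/*' has no effect: a block comment ends at the first '*/', which is why only '/+ +/' can safely wrap code that already contains comments.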

Symbols

The tokens can be roughly divided into three categories: symbols, keywords, and literals / identifiers. Let's list the symbols first:

(
)
{
}
[
]
,
:
;
.
..
...
$
!
&&
||
++
--
==
!=
=i=
=I=
!=i=
!=I=
<
>
<=
>=
=
+
-
*
/
%
\
~
+=
-=
*=
/=
%=
\=
~=

Keywords

The following names are reserved keywords. This means you cannot use any of them as names for your variables, functions, etc.

class
module
return
for
this
new
if
else
foreach
foreach_reverse
do
while
until
continue
break
switch
select
state
struct
enum
import
typeof
singleton
clone
static
const
abstract
override
final
function
with
idle
out
ref
public
private
protected
true
false
native
null
goto
var
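Because keywords are matched case-sensitively, a reserved-name check is a plain set lookup. The Python sketch below copies the list above into a set; the helper name is_keyword is hypothetical and the snippet makes no claim about what else makes an identifier valid.

```python
KEYWORDS = frozenset("""
    class module return for this new if else foreach foreach_reverse do while
    until continue break switch select state struct enum import typeof
    singleton clone static const abstract override final function with idle
    out ref public private protected true false native null goto var
""".split())


def is_keyword(name):
    """Exact, case-sensitive test: 'class' is reserved, 'Class' is not."""
    return name in KEYWORDS


print(is_keyword("class"), is_keyword("Class"))
```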

Identifiers

 

String literals

 

Character literals

 

Number literals

 
 
 
 