My photo

Mildred's Website

Tags:
My avatar

GoogleTalk, Jabber, XMPP address:
mildred@jabber.fr


GPG Public Key
(Fingerprint 197C A7E6 645B 4299 6D37 684B 6F9D A8D6 9A7D 2E2B)

Category: lisaac

Articles from 5 to 14

Wed 28 Nov 2012, 02:49 PM by Mildred Ki'Lya comp dev en lisaac lysaac

The overlooked problem

Let me explain the reason why I think Lisaac is broken and why I stopped working on it. The fundamental concept behind it is completely broken, and i never could solve the problem. First, let me explain a few things about Lisaac:

  • It is statically typed (like Eiffel)
  • It is prototype based (like Self)

It may not seem like it but those two things are mutually exclusive, unless something is done to reconcile the two. Being statically typed means you know the type (or a parent of the real type) at compile-time. Being prototype based means that the type can change completely.

Imagine you have a CUCUMBER that inherit VEGETABLE. If at some point during run-time, the cucumber inherit SOFTWARE (yes, cucumber is the name of a software as well) you have a problem. Yet, this is not the problem I want to talk about.

Lisaac solve this problem because you can only assign a child type as a parent. In the following object:

 Section Header
   + name := CUCUMBER;
 Section Inherit
   + parent :VEGETABLE := ...;

So, you're forbidden to assign a SOFTWARE in the parent slot because SOFTWARE is not a child of VEGETABLE. But, you're allowed to assign a GREEN_VEGETABLE. So not it becomes:

 Section Header
   + name := CUCUMBER;
 Section Inherit
   + parent :VEGETABLE := GREEN_VEGETABLE;

Now, let's say you send a message to your cucumber, and through the magic of inheritance, you end up in a slot of the parent. That is, you are executing code that is located in GREEN_VEGETABLE. Then, you have a problem.

 Section Header
   + name := GREEN_VEGETABLE;
 Section Public
   // inherited from VEGETABLE
   - do_something <- paint_it_green;
 Section Private
   // specific to GREEN_VEGETABLE
   - paint_it_green <- ( "green".println; );

In the code above, if you are executing the do_something slot and have SELF = CUCUMBER, you can't possibly call the paint_it_green slot because it does not exist in CUCUMBER, and the Lisaac compiler will refuse to compile your code. To explain it another way, the type of Self is not compatible with GREEN_VEGETABLE, and this is a problem for this is the code we are executing. You can't easily solve this problem.

In the Self language, if you had a similar situation, it would work because it is not statically typed. It would have failed to find paint_in_green in the cucumber and called the parent, which would have been a green vegetable. Somehow, the type of the cucumber would have changed at runtime, which is incompatible with a type system that can only compute types at compile time (and have them immutable at run time).

So, I stopped working on Lisaac

I tried to start up a new project, Lysaac, but got confronted to the same problem once the compiler was mature enough to start getting into inheritance. So I stopped again. Life getting in the way didn't help.

A new hope

I tried to look into virtual machines, convincing myself that compiled languages couldn't possibly answer such need to dynamic things in the language. I looked at Self for the first time. I tried to build a virtual machine that was fully concurrent. I looked at the Go language because I know that it has nice concurrent primitives ... and I found illumination.

The Go language feels like a dynamic language, but it is compiled. I always wondered how it could implement duck typing, and I found out. I also realized that the difference between a virtual machine with a JIT compiler, and a runtime for a compiled language that included a compiler was very narrow. Basically, nothing prevents any language from being compiled instead of being JIT compiled. You just have to include a compiler in the standard library.

Ironically, that what Common Lisp is doing.

A new language

I like how Go implements interfaces, and I want to do the same. In this new language, an object is composed of :

  • an opaque work (pointer sized)
  • a pointer to an interface

An interface is a collection of pointers to functions. There is always a pointer for the fallback implementation and a pointer for the case operation (to case an object to a new interface).

I want to use a smalltalk/Io type syntax. So the following will represent an object with one variable and one slot:

 obj := {
   var := obj slot;
   slot <- { ... };
 }

The data word of obj would be a pointer to a struct containing the variable "var" The interface of obj would be:

  • a function pointer for the fallback implementation that throws an error
  • a function pointer for the "cast" operation
  • a function pointer for the "slot" slot

I haven't figured out yet how to pass parameters to functions and return objects. This kind of things will require to comply to predefined interfaces (we need a way to tell which object type is returned by a slot for example). Interfaces would probably be defined at compile time and available as constants in code at run time (to be given as parameter to the cast slot).

If I want to be able to compile on the fly the code, I need to include the LLVM library. So I need to implement the language in a language that is compiled using LLVM (so bootstraping will be easier).

Tue 28 Jun 2011, 12:51 PM comp fr lisaac lysaac misc

Lysaac c'est ma réimplémentation du compilateur lisaac. Jusqu'a présent, il n'y avait pas grand chose, mais dernièrement, il y a eu des commits intéressants:

  • les variables fonctionnent
    • avec des valeurs par défaut
    • on peut les lire
    • et y écrire
  • on a aussi des BLOCKs, mais sans upvalues

Ça ne paye peut être pas de mine, mais en fait, l'infrastructure du compilo est presque complète.

Prochaines avancées: héritage et affichage des erreurs

Et peut être après: des améliorations de syntaxe (appels de slot à paramètres et bien plus tard: opérateurs). Pour le moment, je me concentre sur les choses basiques.

Si vous voulez jouer, vous pouvez. Si vous avez une erreur inattendue, créez un scénario d'utilisation et donnez le moi (préférablement sous forme de fichier .feature).

Tue 28 Jun 2011, 12:51 PM comp en lisaac lysaac misc

Lysaac is my reimplementation of the lisaac compiler. Until now, it wasn't very interesting to look at, but recently, I pushed a few interesting commits:

  • you now have variables
    • the default value is initialized correctly
    • the read works
    • the write works
  • you also have BLOCKs (sorry, no upvalues for now)

This may looks like nothing, but under the hood, the infrastructure is almost completely there.

Next thing to come: inheritance and error reporting.

Then perhaps, syntax improvements like keyword messages and later: operators. For now, I want the basic functionnality working well.

If you want to play with it, you can. If you get an error, create a use-case and propose it as a new feature. Please use as a model the .feature files.

Mon 02 May 2011, 12:04 PM comp dev en lisaac lysaac

This is great: Here is the source files:

c/cstring.li

Section Header

  + name := Reference CSTRING;

  - role := String; // const char*
  - type := Integer 8;

c/main.li

Section Header

  + name := MAIN;

Section Public

  - puts str:CSTRING <- External `puts`;

  - main <-
  (
    puts "Hello World";
  );

You type lysaac compile c >c.bc and you get the following LLVM assembly code:

c.bc

@0 = private constant [12 x i8] c"Hello World\00"


declare void @puts (i8*)

define void @main () {
  %1 = getelementptr [12 x i8]* @0, i32 0, i32 0
  tail call void @puts(i8* %1)
  ret void
}

And you can execute it using the standard LLVM tools:

$ llvm-as < c.bc | lli
Hello World
$

Isn't that great ?

Fri 29 Apr 2011, 09:56 AM comp dev lisaac lysaac

Look at this:

object.li

Section Header

  + name := Singleton NULL;

Section Public

  - is_null :BOOLEAN <- FALSE;

null.li

Section Header

  + name := Singleton NULL;

Section Inherit

  - parent :OBJECT := OBJECT;

Section Public

  - is_null :BOOLEAN <- TRUE;

union.li

Section Header

  + name := UNION;

Section Inherit

  - parent :OBJECT := OBJECT;

union.1.li

Section Header

  + name := Expanded UNION(E);

  - import := E;

Section Inherit

  - parent :UNION := UNION;

Section Public

  + element:E;
  - set_element e:E <- (element := e;);

  - when o:T do blc:{o:T;} <-
  (
    (o = E).if {
      blc.value element;
    };
  );

  - from_e e:E :SELF <-
  ( + res :SELF;
    res := clone;
    res.set_element e;
    res
  );

union.2.li

Section Header

  + name := Expanded UNION(E, F...);

Section Inherit

  + parent_e :UNION(E);
  + parent_next :UNION(F...);

Section Public

  - when o:T do blc:{o:T;} <-
  (
    (o = E).if {
      parent_e.when o do blc;
    } else {
      parent_next.when o do blc;
    };
  );

use.li

Section Header

  + name := USE;

Section Public

  - accept_object_or_null obj:UNION(USE,NULL) <-
  (
    obj
    .when NULL do { o:NULL;
      ? { o.is_null };
    }
    .when USE do { o:USE;
      ? { o.is_null.not };
    };
  );

Fri 29 Apr 2011, 09:52 AM comp dev en lisaac lysaac

Stack environment would be an argument passed implicitely to every function in the code. It would contain global policy. In particular the MEMORY object that lets you allocate memory. If you want to change the allocation policy, you just have to change the current environment, and all functions you call will use the new policy.

We could allow user defined objects like that, not just system objects.

We could also manage errors that way. An error flag could be stored in the environment. Set by the calee and tested by the caller.

Fri 29 Apr 2011, 09:40 AM comp dev en lisaac lysaac

Because I'm using an open world assumption, I need the compiler to generate annotations on units it compiles, so when it sees them again, it knows what it does (or does not) internally.

I was looking at a LLVM video this morning (VMKit precisely) and the person talked about an interesting optimization. What if we could allocate objects in stack instead of the heap. This would save time when creating the object. Then we wouldn't be tempted to avoid creating new objects for fear of memory leaks (there is not garbage collector in lisaac currently) and performance penalty.

This is the same thing as aliased variables in Ada.

An object can be allocated on the stack if:

  • it is not returned by the function.
  • it is not stored on the heap by the function.
  • it is not used in a called function that would store a pointer to this object on the heap.

So, when the compiler compiles a cluster, it has to generate an annotation file containing for each argument in each code slot whether the argument is guaranteed to remain on the stack or if it might be stored on the heap. If an argument is guaranteed to stay on the stack, we can allocate it on the stack. When the function will return, the only instances would be located in the current stack frame.

Fri 29 Apr 2011, 09:19 AM comp dev en lisaac lysaac

In Lysaac, I choose to follow the open world assumption, like the majority of programming languages out there, instead of the closed world assumption. There are two main reasons:

  • First, I don't strive at creating an optimizing compiler, not yet at least. Closed world is useful for that, but I don't need it.

  • Second, open world assumption increases the complexity a lot. The Lisaac compiler uses an exponential algorithm, and will always hit a limit with big projects. With an open world, you can partition the complexity.

Because I still believe in global compilation, I decided that my compilation unit would be the cluster instead of the prototype. That is, I'll compile a cluster completely in one object file. That makes it possible to optimize things like private prototypes.

This leaves a big performance problem for BOOLEANs in particular. BOOLEAN, TRUE and FALSE are prototypes in the standard library, and having an open world assumption would require pasing to the if then slot function pointers. I can't realisticly do that.

So, These prototypes could be marked as Inline. They are separated from their cluster and gets compiled in every cluster that uses them. The syntax could be quite simple:

Section Header

  + name := Inline TRUE;

But, because each cluster is then free to compile it as it wants, there is a problem of interoperability. How can you be sure that the TRUE in your cluster is compiled the same way as in the neighbooring cluster you are using. As it is, you can't pass TRUE object around clusters. Very annoying.

The solution would be to encode them and decode them manually. You could have:

Section Header

  + name := Inline TRUE;

Section Feature

  - inline_size :INTEGER := 0;

Take a more interesting example:

Section Header

  + name := Inline Expanded BIT;

  - size := 1;

Section Feature

  - inline_size :INTEGER := 1;
  - encode p:POINTER <-
  (
    p.to_native_array_of BIT.put bit to 0;
  );
  - decode p:POINTER <-
  (
    data := p.to_native_array_of BIT.item 0;
  );

This needs to be refined.

Additionally, .cli files could also contain the Inline keyword. In that case, the cluster it reference will be compiled with the current cluster. That could be useful for private clusters.

Tue 12 Apr 2011, 04:35 PM comp dev en lisaac lysaac

If you noticed, I started my own Lisaac compiler, called Lysaac and I want to make it a little bit different from Lisaac. I'll try to keep compatibility, but for few things, I might take a different direction.

One of these things is the way prototypes are found.

In Lisaac, you have a complete set of prototypes and when you look for a prototype, it is looked everywhere. This is not desirable. Imagine you are writing a library that requires the prototype FOO. Currently, if FOO is not present in the library, instead of issuing an error, the compiler would take the FOO prototype in the application that use the library. Meaning that the library is effectvely using a pieve of the application code.

I want to take the SmartEiffel approach and separate the source code in few clusters. A cluster is a collection of prototypes. And the prototypes in a cluster can only use the prototype of the same cluster or the prototypes of imported clusters. This solve the above dependancy problem.

A cluster is a directory that contain prototypes in .li files and subdirectories. If a subdirectory do not contain .li files, the sub-subdirectories are not recursively searched. A cluster can import another cluster using a cluster file ending with .cli.

An example of .cli file is as follows:

Section Header

  - name := Cluster LIBFOO;

  - path := ("libfoo-3.14", "../libfoo");

The search paths can be:

  • relative to the .cli file if it starts with .
  • relative to LYSAAC_PATH directories otherwise

LISAAC_PATH defaults to $XDG_DATA_HOME/lysaac/lib:/usr/local/share/lysaac/lib:/usr/share/lysaac/lib.

The search paths would then be for this example:

  • $XDG_DATA_HOME/lysaac/lib/libfoo-3.14
  • /usr/local/share/lysaac/lib/libfoo-3.14
  • /usr/share/lysaac/lib/libfoo-3.14
  • ../libfoo

The parser for these files is being written. Then you can see the complete hierarchy of the project:

$ lysaac src
◆ Root Cluster
│ Cluster in: src
├─◆ LIB (src/lib.cli)
│ │ Cluster in: lib
│ ├─◆ STDLIB (lib/stdlib.cli)
│ │ │ Cluster in: /home/mildred/.local/share/lysaac/lib/stdlib
│ │ ├─◇ STRING (...)
│ │ ├─◇ ABSTRACT_STRING (...)
│ │ ╰─◇ ...
│ ├─◇ LIBC (lib/libc.li)
│ ╰─◇ CSTRING (lib/cstring.li)
├─◇ PARSER (src/parser.li)
├─◇ CLUSTER_ITEM (src/cluster_item.li)
├─◇ ITM_STYLE (src/itm_style.li)
├─◇ LYSAAC (src/lysaac.li)
├─◇ ITM_AFFECT (src/itm_affect.li)
├─◇ ANY (src/any.li)
├─◇ PARSER_CLI (src/parser_cli.li)
╰─◇ CLUSTER (src/cluster.li)

Now, each item in a cluster can be public or private. Public items are available to the users of the clusters whereas private items are restricted to members of the same cluster. To declare a private item, just say:

Section Header

  + name := Private PROTOTYPE;

or

Section Header

  - name := Private Cluster LIBTOTO;

If you want to declare a whole bunch of prototypes private to your cluster, just include them in a private cluster. To do so, you'll need the following files:

  • cluster/my_private_prototypes.cli:

    Section Header
    
      - name := Private Cluster MY_PRIVATE_PROTOTYPES;
    
      - path := "./deps/my_private_prototypes";
      // makes the cluster relative to the .cli file
      // use a deps additional directory to avoid the current cluster to
      // look into deps/my_private_prototypes. deps should not contain any
      // files, just subdirectories.
    
  • cluster/deps/my_private_prototypes/private_proto.li:

    Section Header
    
      + name := Public Prototype PRIVATE_PROTO;
    

Tue 29 Mar 2011, 02:34 PM comp en lisaac

Inspired from ruby #Array:

Section Header

  + name := PIPE(E);

Section Private

  + blc :{;BOOLEAN,E};

Section Public

  - create blc_:{;BOOLEAN,E} :SELF <-
  (
    blc := blc_;
    Self
  );

  - read :(BOOLEAN,E) <- blc.value

  - map blc:{E;X} :PIPE(X) <-
  ( // Upvalue Self
    PIPE(X).clone.make {
      + have_e :BOOLEAN;
      + e :E;
      (have_e, e) := Self.read;
      have_e.if {
        e := blc.value e;
      };
      have_e, e
    }
  );

  - select blc:{E;BOOLEAN} :PIPE(E) <-
  ( // Upvalue Self
    PIPE(E).clone.make {
      + have_e :BOOLEAN;
      + e :E;
      (have_e, e) := Self.read;
      have_e.if {
        blc.value e.if_false {
          have_e := FALSE;
        };
      };
      have_e, e
    }
  );

  - flatten_1 :PIPE(X) <-
  ( // Upvalue Self, buffer
    + buffer :PIPE(X);
    PIPE(X).clone.make {
      + have_e, have_x :BOOLEAN;
      + e :E;
      + x :X;
      buffer.is_null.if_false {
        (have_e, e) := buffer.read;
        have_e.if_false {
          buffer := NULL;
        };
      };
      buffer.is_null.if {
        (have_e, e) := Self.read;
        have_e.if {
          (buffer ?= e).is_null.if_false {
            (have_x, x) := buffer.read
            // TODO: cas où have_e = FALSE (buffer est vide)
          };
        };
      };
      have_x, x
    }
  );

  - to_array_of_e :ARRAY(E) <-
  ( + result :ARRAY(E);
    + have_e :BOOLEAN;
    + e :E;
    {(have_e, e) := read;
    have_e
    }.while_do {
      result.add_last e;
    };
    result
  );

TODO:

  • Upvalue
  • Syntax improvement (block input variables is too verbose)
  • More typing of parameter types, particurlarly as return values (map return unknown type X)
  • var := bool.if {val1} else {val2}
  • NULL.is_null -> TRUE
Page: