Mildred's Website

GoogleTalk, Jabber, XMPP address:
mildred@jabber.fr


GPG Public Key
(Fingerprint 197C A7E6 645B 4299 6D37 684B 6F9D A8D6 9A7D 2E2B)

Category: dev

Wed 19 Oct 2011, 10:21 AM by Mildred Ki'Lya comp dev en privacy web

What is this all about? Web privacy. We are tracked everywhere, and I'd like to help if possible. So let's design a web browser that, for once, respects your privacy.

Main features:

  • Each website has its own cookie jar, its own cache and its own HTML5 local storage
  • History related css attributes are disabled
  • External plugins are only enabled on demand
  • Support for Tor/I2P is enabled by default
  • You have complete control over who receives what information
  • Lets you choose in the settings whether or not to allow referrers
  • The context menu lets you open links anonymously (no referrer, anonymous session)

The browser comes with a dedicated UI that lets you control everything during your browsing session. It is non-intrusive and makes the best choice by default. I am thinking of a notification bar shown at the bottom of the window. I noticed that this location is not intrusive when I realized that the search bar in Firefox was open most of the time, even though most of the time I wasn't using it.

First, let's define a session:

  • A session can be used by more than one domain at the same time.
  • A session is associated with a specific cache storage
  • A session is associated with a specific HTML5 storage
  • A session is associated with a specific cookie jar
  • A session can be closed. When it is closed, session cookies are deleted from the session
  • A session can be reopened. All long-lasting cookies, cache, HTML5 storage and the like are then used again.
  • A session can be anonymous. In that case, the session is deleted completely when it is closed.
  • A session is associated with zero, one or more domains. These are the domains the end user can see in the address bar, not those of the sub-resources in the page.
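
The session rules above can be sketched in a few lines. This is only an illustration, not the planned implementation: the `Session` class, its field names and the use of plain dictionaries in place of a real cookie jar, cache and HTML5 storage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Session:
    """Hypothetical model of a browsing session (names are illustrative)."""
    domains: Set[str] = field(default_factory=set)
    anonymous: bool = False
    is_open: bool = True
    # cookie name -> (value, persistent?)
    cookies: Dict[str, Tuple[str, bool]] = field(default_factory=dict)
    cache: Dict[str, bytes] = field(default_factory=dict)
    local_storage: Dict[str, str] = field(default_factory=dict)

    def set_cookie(self, name: str, value: str, persistent: bool = False) -> None:
        self.cookies[name] = (value, persistent)

    def close(self) -> None:
        if self.anonymous:
            # An anonymous session is deleted completely when closed.
            self.domains.clear()
            self.cookies.clear()
            self.cache.clear()
            self.local_storage.clear()
        else:
            # Only session cookies are dropped; long-lasting data survives.
            self.cookies = {n: c for n, c in self.cookies.items() if c[1]}
        self.is_open = False

    def reopen(self) -> None:
        # Whatever survived close() is used again.
        self.is_open = True
```

Closing a normal session keeps the persistent cookies, the cache and the local storage, so reopening it behaves like reopening an existing profile, whereas an anonymous session comes back empty.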

Sessions are like Firefox profiles: opening a new session is like opening a Firefox profile you have just created. Sessions are needed because people will never create a different Firefox profile for each site.

If we want to protect privacy, a new session should be created each time a link is opened. To keep web sites usable, sessions can be shared in specific cases. Let's define the cases where it makes sense to share a session:

  • You click a link or submit a form and expect to still be logged in to the site you are viewing. You don't care if you follow a link to an external page.

    User Interface: If the link matches one of the domains of the session, then keep the session. No UI. If the user wanted a new session, the "Open anonymously" entry in the context menu exists. A button on the toolbar might be available to enter a state where we always want to open links anonymously.

    If the link points to another domain, then open the link in a new session unless "Open in the same session" was specified in the context menu. The UI contains:

    We protected your privacy by separating <domain of the new site> from
    the site you were visiting previously (<domain of the previous site>).
    
    Choices: [ (1) Create a new anonymous session          | ▼ ]
             | (2) Continue session from <previous domain> |
             | (3) Use a previous session for <new domain> |
             | (4) Use session from bookmark "<name>"      |
    

    The first choice is considered the default and the page is loaded with it. If the user chooses a new option, then the page is reloaded.

    If the user chooses (2), the page is reloaded with the previous session and the user is asked: "Do you want <previous domain> and <new domain> to have access to the same private information about you?". Answers are Yes, No and Always. If the answer is Always, the two domains are recorded in the configuration as one and the same.

    The choice (3) will use the most recent session for the new domain. It might be a session currently in use or a session in the history.

    There are as many (4) options as there are bookmarks for the new domain. If different bookmarks share a single session, only one bookmark is shown. This choice will load the session from the bookmark.

    If (3) and (4) are the same sessions, and there is only one bookmark (4), then the (4) option is left out.

  • You use a bookmark and expect to continue the session you had for this bookmark (webmails)

    The session is simply stored in the bookmark. When saving a bookmark, there is an option to store the session with it or not.

    [X] Do not save any personal information with this bookmark
    
  • You open a new URL and you might want to reuse a session that was opened for this URL.

    The User Interface allows you to restore the session:

    We protected your privacy by not sending any personal information to
    <domain>. If you want <domain> to receive private information, please
    select:
    
    Choices: [ Do not send private information     | ▼ ]
             | Use a previous session for <domain> |
             | Use session from bookmark "<name>"  |
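
The rules for following a link can be sketched as a small decision function. This is a rough sketch under assumptions: the function name, the argument shapes (plain dictionaries keyed by domain and bookmark name) and the exact-hostname comparison are all invented for illustration.

```python
from urllib.parse import urlsplit


def session_choices(session_domains, previous_domain, link_url,
                    recent_sessions, bookmarks):
    """Return the choices offered when a link is followed.

    session_domains: domains attached to the current session
    recent_sessions: domain -> id of the most recent session for it
    bookmarks: bookmark name -> (domain, stored session id or None)
    The first entry of the result is the default.
    """
    target = urlsplit(link_url).hostname
    if target in session_domains:
        return ["keep current session"]  # same site: no UI at all

    choices = ["new anonymous session",
               f"continue session from {previous_domain}"]
    recent = recent_sessions.get(target)
    if recent is not None:
        choices.append(f"previous session for {target}")

    # Show one bookmark per distinct stored session.
    by_session = {}
    for name, (domain, session_id) in bookmarks.items():
        if domain == target and session_id is not None:
            by_session.setdefault(session_id, name)
    # A single bookmark that duplicates choice (3) is left out.
    if len(by_session) == 1 and recent in by_session:
        by_session = {}
    choices.extend(f'session from bookmark "{name}"'
                   for name in by_session.values())
    return choices
```

A real browser would probably compare registrable domains rather than exact host names; that refinement is left out here.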
    

If you can see other use cases, please comment on that.

From these use cases, I can infer three kinds of sessions:

  • Live sessions, currently in use
  • Saved sessions, associated to a bookmark
  • Closed sessions from the past, accessible through the history. They are garbage-collected after too long a time.

Now, how to implement that? I was thinking of QtWebKit as I already worked with Qt and it's easy to work with.

  • We have a main widget: QWebView. We want to change the session when a new page is loaded. So we hook up with the signal loadStarted.
  • We prevent history related CSS rules by implementing QWebHistoryInterface, more specifically, we store the history necessary to implement QWebHistoryInterface in the session.
  • We change the cache by implementing QAbstractNetworkCache and setting it using view->page()->networkAccessManager()->setCache(...)
  • We change the cookie jar by implementing QNetworkCookieJar and setting it using view->page()->networkAccessManager()->setCookieJar(...)
  • Change the local storage path using a directory dedicated for the session and using view->page()->settings()->setLocalStoragePath(QString)

After all that, we'll have to inspect the resulting browser to determine if there are still areas where we fail at protecting privacy.

Mon 09 May 2011, 02:40 PM comp dev en

FAQ

  • What is HMP: It is a messaging protocol intended to replace e-mail.

  • Why replace e-mail: Because it is full of spam and unmaintainable. This alternative is lighter and easier to implement than a full SMTP server with spam management.

  • Why use HTTP: I'm not a fan of putting everything over HTTP, but it has its advantages:

    • It has a security layer (HTTPS)
    • It is (relatively) simple and implemented everywhere
    • It manages content-types and different types of requests
    • It is extensible
    • It goes easily through proxies and NATs
    • It allows multiplexing many different resources on the same server

    In the long run, perhaps we should move away from HTTP as:

    • It is too associated with the web
    • It doesn't let the server initiate communication.

    WebSockets could be a good alternative one day.

  • How do I get my messages: Not specified, although you could possibly authenticate using a standard HTTP method to the same resource as your address and issue a GET command.

  • Does this allow a web implementation: Yes. It will need to be specified further, but if the server detects a browser request (without the HMP headers) on the resource, it could serve a web page with a form.

  • Is the message format specified: no, it needs to be. I plan on using JSON.

  • Do you plan an implementation: Yes, probably using node.js or Lua.

  • What prompted this: The Tor network doesn't have any standard messaging system. I don't believe SMTP is suited for that.

  • Why write this spec when you have no code to back it up: Because I like writing specs, and it's a way to remind myself to write the code and to work out how I should write it. I might not get the time to write it as soon as I want.

What is an HMP address

Scheme:

[hmp:]server[:port][/path]

Example:

hmp:gmail.com:80/user
gmail.com:80/user
gmail.com/user

domain.org/u/alicia

Translation to HTTPS resources

An HMP address can be translated directly to an HTTPS resource. The standard scheme translates to:

https://server:port/path
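
As a minimal sketch of this translation (the function name `hmp_to_https` is made up, and leaving out the port when the address has none is an assumption, since the scheme marks it as optional):

```python
def hmp_to_https(address: str) -> str:
    """Translate [hmp:]server[:port][/path] into an HTTPS URL."""
    if address.startswith("hmp:"):
        address = address[len("hmp:"):]
    # Everything before the first "/" is server[:port].
    server_port, slash, path = address.partition("/")
    if not server_port:
        raise ValueError("HMP address has no server part")
    return f"https://{server_port}{slash}{path}"
```

For instance, `hmp_to_https("hmp:gmail.com:80/user")` gives `https://gmail.com:80/user`, and `hmp_to_https("domain.org/u/alicia")` gives `https://domain.org/u/alicia`.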

Message sending overview

To send a message from domain.org/alicia to users.net/~bob, the sequence is:

  • Connection to users.net:

    [1] POST https://users.net/~bob
    [1] HMP-Pingback: 235
    [1] HMP-From: domain.org/alicia
    [1] Content: message-content
    
  • users.net opens a connection to domain.org

    [2] GET https://domain.org/alicia
    [2] HMP-Pingback: 235
    [2] HMP-Method: MD5
    
  • domain.org responds to users.net

    [2] HMP-Hash: ef0167eca19bb2d4c8dfe4c3803cc204
    [2] Status: 200
    
  • users.net responds to the original sender

    [1] Status: 200
    

Headers to the POST request

The POST request is the request used to post a message. It contains two specific headers:

  • HMP-From: The address the message is sent from

  • HMP-Pingback: A sequence number that identifies the message for the sender. It need not be globally unique, as long as at any one point in time only one message corresponds to this ID.

Particular status codes:

  • 200 in case of success

  • 403 in case the From address could not be authenticated

From address authentication, pingback

In order to avoid SPAM, the sender must be authenticated when the message is sent. For this reason, before accepting or rejecting the request, the server must initiate a pingback procedure to the sender.

First, the From address is converted to an HTTPS resource and a GET connection is initiated. The specific request-headers are:

  • HMP-Pingback: the pingback sequence number from the previous request

  • HMP-Method: method for verifying the originating message. The only specified method is "MD5"

MD5 Method

If the message is recognized, the sender's server responds with the following header:

  • HMP-Hash: MD5 hash of the content of the message identified by the pingback identifier

The status code can be:

  • 200 in case the message was recognized

  • 404 in case the message was not found

If the MD5 sum corresponds to the message received and a success code was given, the From address is verified and the message can be accepted.
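
The whole exchange can be simulated in-process to check the logic. This is only a sketch: the `Server` class, its method names and the `network` dictionary standing in for real HTTPS connections are assumptions, and nothing here performs actual network I/O.

```python
import hashlib


class Server:
    """Toy HMP server; 'network' maps host names to Server objects."""

    def __init__(self, host, network):
        self.host = host
        self.network = network
        network[host] = self
        self.outbox = {}   # pingback id -> content currently being sent
        self.inbox = []    # (from address, content) of accepted messages
        self.next_id = 1

    def send(self, from_addr, to_addr, content):
        """[1] POST the message; returns the status from the recipient."""
        pingback = self.next_id
        self.next_id += 1
        self.outbox[pingback] = content
        try:
            dest = self.network[to_addr.split("/")[0]]
            return dest.handle_post(from_addr, pingback, content)
        finally:
            del self.outbox[pingback]  # the id may be reused afterwards

    def handle_pingback(self, pingback):
        """[2] GET with HMP-Method: MD5 (the only method specified)."""
        if pingback not in self.outbox:
            return 404, None
        return 200, hashlib.md5(self.outbox[pingback]).hexdigest()

    def handle_post(self, from_addr, pingback, content):
        origin = self.network.get(from_addr.split("/")[0])
        if origin is None:
            return 403
        status, digest = origin.handle_pingback(pingback)
        if status != 200 or digest != hashlib.md5(content).hexdigest():
            return 403  # From address could not be authenticated
        self.inbox.append((from_addr, content))
        return 200
```

A message sent through `send` is accepted because the origin server knows the pingback identifier, while a POST forged by a third party with an unknown identifier, or with different content, is rejected with 403.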

Mon 02 May 2011, 12:04 PM comp dev en lisaac lysaac

This is great. Here are the source files:

c/cstring.li

Section Header

  + name := Reference CSTRING;

  - role := String; // const char*
  - type := Integer 8;

c/main.li

Section Header

  + name := MAIN;

Section Public

  - puts str:CSTRING <- External `puts`;

  - main <-
  (
    puts "Hello World";
  );

You type lysaac compile c >c.bc and you get the following LLVM assembly code:

c.bc

@0 = private constant [12 x i8] c"Hello World\00"


declare void @puts (i8*)

define void @main () {
  %1 = getelementptr [12 x i8]* @0, i32 0, i32 0
  tail call void @puts(i8* %1)
  ret void
}

And you can execute it using the standard LLVM tools:

$ llvm-as < c.bc | lli
Hello World
$

Isn't that great?

Fri 29 Apr 2011, 09:56 AM comp dev lisaac lysaac

Look at this:

object.li

Section Header

  + name := Singleton OBJECT;

Section Public

  - is_null :BOOLEAN <- FALSE;

null.li

Section Header

  + name := Singleton NULL;

Section Inherit

  - parent :OBJECT := OBJECT;

Section Public

  - is_null :BOOLEAN <- TRUE;

union.li

Section Header

  + name := UNION;

Section Inherit

  - parent :OBJECT := OBJECT;

union.1.li

Section Header

  + name := Expanded UNION(E);

  - import := E;

Section Inherit

  - parent :UNION := UNION;

Section Public

  + element:E;
  - set_element e:E <- (element := e;);

  - when o:T do blc:{o:T;} <-
  (
    (o = E).if {
      blc.value element;
    };
  );

  - from_e e:E :SELF <-
  ( + res :SELF;
    res := clone;
    res.set_element e;
    res
  );

union.2.li

Section Header

  + name := Expanded UNION(E, F...);

Section Inherit

  + parent_e :UNION(E);
  + parent_next :UNION(F...);

Section Public

  - when o:T do blc:{o:T;} <-
  (
    (o = E).if {
      parent_e.when o do blc;
    } else {
      parent_next.when o do blc;
    };
  );

use.li

Section Header

  + name := USE;

Section Public

  - accept_object_or_null obj:UNION(USE,NULL) <-
  (
    obj
    .when NULL do { o:NULL;
      ? { o.is_null };
    }
    .when USE do { o:USE;
      ? { o.is_null.not };
    };
  );

Fri 29 Apr 2011, 09:52 AM comp dev en lisaac lysaac

A stack environment would be an argument passed implicitly to every function in the code. It would contain global policy, in particular the MEMORY object that lets you allocate memory. If you want to change the allocation policy, you just have to change the current environment, and all functions you call will use the new policy.

We could allow user defined objects like that, not just system objects.

We could also manage errors that way. An error flag could be stored in the environment, set by the callee and tested by the caller.
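
To make the idea concrete, here is a small sketch in Python rather than Lisaac. The environment is passed explicitly, whereas the proposal is for the compiler to pass it implicitly; the `Environment` class and every function name below are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Environment:
    """Hypothetical stack environment: allocation policy plus error flag."""
    allocate: Callable[[int], bytearray]
    error: Optional[str] = None


def make_logging_allocator(log):
    """An alternative allocation policy that records every request."""
    def allocate(size):
        log.append(size)
        return bytearray(size)
    return allocate


def make_buffer(env, size):
    if size < 0:
        env.error = "negative size"  # set by the callee
        return None
    return env.allocate(size)        # uses whatever policy is current


def duplicate(env, data):
    buf = make_buffer(env, len(data))
    buf[:] = data
    return buf
```

Swapping the `allocate` field changes the policy for every function called with that environment, and a caller can test `env.error` after a call instead of relying on a special return value.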

Fri 29 Apr 2011, 09:40 AM comp dev en lisaac lysaac

Because I'm using an open world assumption, I need the compiler to generate annotations on the units it compiles, so that when it sees them again, it knows what they do (or do not do) internally.

I was watching an LLVM video this morning (about VMKit, precisely) and the speaker talked about an interesting optimization. What if we could allocate objects on the stack instead of the heap? This would save time when creating the object. Then we wouldn't be tempted to avoid creating new objects for fear of memory leaks (there is no garbage collector in Lisaac currently) and performance penalties.

This is the same thing as aliased variables in Ada.

An object can be allocated on the stack if:

  • it is not returned by the function.
  • it is not stored on the heap by the function.
  • it is not used in a called function that would store a pointer to this object on the heap.

So, when the compiler compiles a cluster, it has to generate an annotation file stating, for each argument of each code slot, whether the argument is guaranteed to remain on the stack or whether it might be stored on the heap. If an argument is guaranteed to stay on the stack, we can allocate it on the stack. When the called function returns, the only remaining references to the object are located in the current stack frame.
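
The annotation pass can be sketched on a toy instruction set. Everything here is an assumption made for illustration: the tuple-based instructions, the function names and the lack of any alias tracking (a variable copied into another variable is not followed) have nothing to do with the real compiler.

```python
# Toy instructions (hypothetical, not Lysaac's real representation):
#   ("return", var)           var is returned to the caller
#   ("store_heap", var)       var is written into a heap object
#   ("call", callee, [vars])  vars are passed as arguments, in order


def escaping_vars(fn, summaries):
    """Variables of fn that may outlive its stack frame."""
    escaped = set()
    for instr in fn["body"]:
        if instr[0] in ("return", "store_heap"):
            escaped.add(instr[1])
        elif instr[0] == "call":
            _, callee, args = instr
            summary = summaries.get(callee)
            for index, arg in enumerate(args):
                # Unknown callee (open world): assume the argument escapes.
                if summary is None or index in summary:
                    escaped.add(arg)
    return escaped


def annotate(program):
    """Map each function name to the parameter indices that may escape."""
    summaries = {name: set() for name in program}
    changed = True
    while changed:  # iterate until the annotations stop changing
        changed = False
        for name, fn in program.items():
            escaped = escaping_vars(fn, summaries)
            new = {i for i, p in enumerate(fn["params"]) if p in escaped}
            if new != summaries[name]:
                summaries[name] = new
                changed = True
    return summaries
```

The result of `annotate` plays the role of the annotation file: a local object can be stack-allocated in a function when it does not appear in `escaping_vars` for that function.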

Fri 29 Apr 2011, 09:19 AM comp dev en lisaac lysaac

In Lysaac, I chose to follow the open world assumption, like the majority of programming languages out there, instead of the closed world assumption. There are two main reasons:

  • First, I am not striving to create an optimizing compiler, not yet at least. A closed world is useful for that, but I don't need it.

  • Second, the closed world assumption increases the complexity a lot. The Lisaac compiler uses an exponential algorithm, and will always hit a limit with big projects. With an open world, you can partition the complexity.

Because I still believe in global compilation, I decided that my compilation unit would be the cluster instead of the prototype. That is, I'll compile a cluster completely in one object file. That makes it possible to optimize things like private prototypes.

This leaves a big performance problem for BOOLEANs in particular. BOOLEAN, TRUE and FALSE are prototypes in the standard library, and an open world assumption would require passing function pointers to the if then slot. I can't realistically do that.

So these prototypes could be marked as Inline. They are separated from their cluster and get compiled in every cluster that uses them. The syntax could be quite simple:

Section Header

  + name := Inline TRUE;

But because each cluster is then free to compile it as it wants, there is an interoperability problem. How can you be sure that the TRUE in your cluster is compiled the same way as in the neighbouring cluster you are using? As it is, you can't pass TRUE objects between clusters. Very annoying.

The solution would be to encode them and decode them manually. You could have:

Section Header

  + name := Inline TRUE;

Section Feature

  - inline_size :INTEGER := 0;

Take a more interesting example:

Section Header

  + name := Inline Expanded BIT;

  - size := 1;

Section Feature

  - inline_size :INTEGER := 1;
  - encode p:POINTER <-
  (
    p.to_native_array_of BIT.put bit to 0;
  );
  - decode p:POINTER <-
  (
    data := p.to_native_array_of BIT.item 0;
  );

This needs to be refined.

Additionally, .cli files could also contain the Inline keyword. In that case, the cluster it references will be compiled with the current cluster. That could be useful for private clusters.

Wed 20 Apr 2011, 11:17 AM comp dev lysaac

◆ Root Cluster
│ Cluster in: src
├─◆ LIB (src/lib.cli)
│ │ Cluster in: src/../lib
│ ├─◇ PATH_HELPER (src/../lib/path_helper.li)
│ ├─◇ CSTRING (src/../lib/cstring.li)
│ ╰─◇ LIBC (src/../lib/libc.li)
├─◇ PARSER (src/parser.li)
├─◆ STDLIB (src/stdlib.cli)
│ │ Cluster in: src/../stdlib/standard
│ ├─◆ INTERNAL (src/../stdlib/standard/internal.cli)
│ │ │ Cluster in: src/../stdlib/standard/../internal
│ │ ├─◆ PORTABLE (src/../stdlib/standard/../internal/portable.cli)
│ │ │ │ Cluster in: src/../stdlib/standard/../internal/portable
│ │ │ ├─◇ FLOAT_REAL (src/../stdlib/standard/../internal/portable/number/float_real.li)
│ │ │ ├─◇ FIXED_REAL (src/../stdlib/standard/../internal/portable/number/fixed_real.li)
│ │ │ ├─◇ FLOAT_MAP80 (src/../stdlib/standard/../internal/portable/number/float_map80.li)
│ │ │ ├─◇ SIGNED_INTEGER (src/../stdlib/standard/../internal/portable/number/signed_integer.li)
│ │ │ ├─◇ FLOAT_MAP32 (src/../stdlib/standard/../internal/portable/number/float_map32.li)
│ │ │ ├─◇ UNSIGNED_INTEGER (src/../stdlib/standard/../internal/portable/number/unsigned_integer.li)
│ │ │ ├─◇ FLOAT_MAP64 (src/../stdlib/standard/../internal/portable/number/float_map64.li)
│ │ │ ├─◇ SIGNED_FIXED_REAL (src/../stdlib/standard/../internal/portable/number/signed_fixed_real.li)
│ │ │ ├─◇ NUMERIC (src/../stdlib/standard/../internal/portable/number/numeric.li)
│ │ │ ├─◇ FLOAT_MAP (src/../stdlib/standard/../internal/portable/number/float_map.li)
│ │ │ ├─◇ UNSIGNED_FIXED_REAL (src/../stdlib/standard/../internal/portable/number/unsigned_fixed_real.li)
│ │ │ ├─◇ FILE_INPUT_STREAM (src/../stdlib/standard/../internal/portable/io/file_input_stream.li)
│ │ │ ├─◇ STD_INPUT_OUTPUT (src/../stdlib/standard/../internal/portable/io/std_input_output.li)
│ │ │ ├─◇ FILE_OUTPUT_STREAM (src/../stdlib/standard/../internal/portable/io/file_output_stream.li)
│ │ │ ├─◇ INPUT_STREAM (src/../stdlib/standard/../internal/portable/io/input_stream.li)
│ │ │ ├─◇ OUTPUT_STREAM (src/../stdlib/standard/../internal/portable/io/output_stream.li)
│ │ │ ├─◇ MEMORY (src/../stdlib/standard/../internal/portable/memory/memory.li)
│ │ │ ├─◇ SYSTEM_DETECT (src/../stdlib/standard/../internal/portable/system/system_detect.li)
│ │ │ ├─◇ HASHED_DICTIONARY_NODE (src/../stdlib/standard/../internal/portable/collection/hashed_dictionary_node.li)
│ │ │ ├─◇ COLLECTION (src/../stdlib/standard/../internal/portable/collection/collection.li)
│ │ │ ├─◇ HASH_TABLE_SIZE (src/../stdlib/standard/../internal/portable/collection/hash_table_size.li)
│ │ │ ├─◇ ANY_HASHED_BIJECTIVE_DICTIONARY_NODE (src/../stdlib/standard/../internal/portable/collection/any_hashed_bijective_dictionary_node.li)
│ │ │ ├─◇ ANY_LINKED_LIST_NODE (src/../stdlib/standard/../internal/portable/collection/any_linked_list_node.li)
│ │ │ ├─◇ ANY_AVL_SET_NODE (src/../stdlib/standard/../internal/portable/collection/any_avl_set_node.li)
│ │ │ ├─◇ LINKED2_LIST_NODE (src/../stdlib/standard/../internal/portable/collection/linked2_list_node.li)
│ │ │ ├─◇ ANY_AVL_DICTIONARY_NODE (src/../stdlib/standard/../internal/portable/collection/any_avl_dictionary_node.li)
│ │ │ ├─◇ COLLECTION3 (src/../stdlib/standard/../internal/portable/collection/collection3.li)
│ │ │ ├─◇ SET (src/../stdlib/standard/../internal/portable/collection/set.li)
│ │ │ ├─◇ ANY_TWO_WAY_LINKED_LIST_NODE (src/../stdlib/standard/../internal/portable/collection/any_two_way_linked_list_node.li)
│ │ │ ├─◇ HASHED_SET_NODE (src/../stdlib/standard/../internal/portable/collection/hashed_set_node.li)
│ │ │ ├─◇ ARRAYED_COLLECTION (src/../stdlib/standard/../internal/portable/collection/arrayed_collection.li)
│ │ │ ├─◇ SIMPLE_DICTIONARY (src/../stdlib/standard/../internal/portable/collection/simple_dictionary.li)
│ │ │ ├─◇ DICTIONARY (src/../stdlib/standard/../internal/portable/collection/dictionary.li)
│ │ │ ├─◇ AVL_DICTIONARY_NODE (src/../stdlib/standard/../internal/portable/collection/avl_dictionary_node.li)
│ │ │ ├─◇ AVL_CONSTANTS (src/../stdlib/standard/../internal/portable/collection/avl_constants.li)
│ │ │ ├─◇ ANY_HASHED_SET_NODE (src/../stdlib/standard/../internal/portable/collection/any_hashed_set_node.li)
│ │ │ ├─◇ NATIVE_ARRAY (src/../stdlib/standard/../internal/portable/collection/native_array.li)
│ │ │ ├─◇ AVL_TREE (src/../stdlib/standard/../internal/portable/collection/avl_tree.li)
│ │ │ ├─◇ NATIVE_ARRAY_VOLATILE (src/../stdlib/standard/../internal/portable/collection/native_array_volatile.li)
│ │ │ ├─◇ COLLECTION2 (src/../stdlib/standard/../internal/portable/collection/collection2.li)
│ │ │ ├─◇ ANY_HASHED_DICTIONARY_NODE (src/../stdlib/standard/../internal/portable/collection/any_hashed_dictionary_node.li)
│ │ │ ├─◇ ARRAYED (src/../stdlib/standard/../internal/portable/collection/arrayed.li)
│ │ │ ├─◇ AVL_SET_NODE (src/../stdlib/standard/../internal/portable/collection/avl_set_node.li)
│ │ │ ├─◇ LINKED_XOR_NODE (src/../stdlib/standard/../internal/portable/collection/linked_xor_node.li)
│ │ │ ├─◇ LINKED_LIST_NODE (src/../stdlib/standard/../internal/portable/collection/linked_list_node.li)
│ │ │ ├─◇ AVL_TREE_NODE (src/../stdlib/standard/../internal/portable/collection/avl_tree_node.li)
│ │ │ ├─◇ LINKED_COLLECTION (src/../stdlib/standard/../internal/portable/collection/linked_collection.li)
│ │ │ ├─◇ FS_MIN (src/../stdlib/standard/../internal/portable/file_system/fs_min.li)
│ │ │ ├─◇ STRING_BUFFER (src/../stdlib/standard/../internal/portable/string/string_buffer.li)
│ │ │ ╰─◇ CHARACTER_REF (src/../stdlib/standard/../internal/portable/string/character_ref.li)
│ │ ╰─◆ UNIX (src/../stdlib/standard/../internal/unix.cli)
│ │   │ Cluster in: src/../stdlib/standard/../internal/os_support/unix
│ │   ├─◇ FLOAT_PROCESSOR (src/../stdlib/standard/../internal/os_support/unix/system/float_processor.li)
│ │   ├─◇ SYSTEM (src/../stdlib/standard/../internal/os_support/unix/system/system.li)
│ │   ├─◇ CLOCK (src/../stdlib/standard/../internal/os_support/unix/system/clock.li)
│ │   ├─◇ ENVIRONMENT (src/../stdlib/standard/../internal/os_support/unix/system/environment.li)
│ │   ├─◇ SYSTEM_IO (src/../stdlib/standard/../internal/os_support/unix/system/system_io.li)
│ │   ├─◇ PROCESSOR (src/../stdlib/standard/../internal/os_support/unix/system/processor.li)
│ │   ├─◇ EVENT_SYSTEM (src/../stdlib/standard/../internal/os_support/unix/video/event_system.li)
│ │   ├─◇ KEYBOARD (src/../stdlib/standard/../internal/os_support/unix/video/keyboard.li)
│ │   ├─◇ TIMER (src/../stdlib/standard/../internal/os_support/unix/video/timer.li)
│ │   ├─◇ VIDEO (src/../stdlib/standard/../internal/os_support/unix/video/video.li)
│ │   ├─◇ MOUSE (src/../stdlib/standard/../internal/os_support/unix/video/mouse.li)
│ │   ├─◇ FILE_UNIX (src/../stdlib/standard/../internal/os_support/unix/file_system/file_unix.li)
│ │   ├─◇ FILE_SYSTEM (src/../stdlib/standard/../internal/os_support/unix/file_system/file_system.li)
│ │   ├─◇ ENTRY_UNIX (src/../stdlib/standard/../internal/os_support/unix/file_system/entry_unix.li)
│ │   ├─◇ DIRECTORY_UNIX (src/../stdlib/standard/../internal/os_support/unix/file_system/directory_unix.li)
│ │   ├─◇ BMP_LINE_ASCII (src/../stdlib/standard/../internal/os_support/unix/video_ascii/bmp_line_ascii.li)
│ │   ├─◇ BITMAP_ASCII (src/../stdlib/standard/../internal/os_support/unix/video_ascii/bitmap_ascii.li)
│ │   ├─◇ VIDEO (src/../stdlib/standard/../internal/os_support/unix/video_ascii/video.li)
│ │   ╰─◇ PIXEL_ASCII (src/../stdlib/standard/../internal/os_support/unix/video_ascii/pixel_ascii.li)
│ ├─◇ STD_ERROR (src/../stdlib/standard/io/std_error.li)
│ ├─◇ COMMAND_LINE (src/../stdlib/standard/io/command_line.li)
│ ├─◇ IO (src/../stdlib/standard/io/io.li)
│ ├─◇ STD_INPUT (src/../stdlib/standard/io/std_input.li)
│ ├─◇ STD_OUTPUT (src/../stdlib/standard/io/std_output.li)
│ ├─◇ TIME (src/../stdlib/standard/time/time.li)
│ ├─◇ DATE (src/../stdlib/standard/time/date.li)
│ ├─◇ HASHABLE (src/../stdlib/standard/property/hashable.li)
│ ├─◇ COMPARABLE (src/../stdlib/standard/property/comparable.li)
│ ├─◇ SAFE_EQUAL (src/../stdlib/standard/property/safe_equal.li)
│ ├─◇ TRAVERSABLE (src/../stdlib/standard/property/traversable.li)
│ ├─◇ OBJECT (src/../stdlib/standard/kernel/object.li)
│ ├─◇ I_DONT_KNOW_PROTOTYPING (src/../stdlib/standard/kernel/i_dont_know_prototyping.li)
│ ├─◇ POINTER (src/../stdlib/standard/kernel/pointer.li)
│ ├─◇ CONVERT (src/../stdlib/standard/kernel/convert.li)
│ ├─◇ REFERENCE (src/../stdlib/standard/kernel/reference.li)
│ ├─◇ BLOCK (src/../stdlib/standard/kernel/block.li)
│ ├─◇ HASHED_DICTIONARY (src/../stdlib/standard/collection/hashed_dictionary.li)
│ ├─◇ ARRAY2 (src/../stdlib/standard/collection/array2.li)
│ ├─◇ AVL_SET (src/../stdlib/standard/collection/avl_set.li)
│ ├─◇ LINKED2_LIST (src/../stdlib/standard/collection/linked2_list.li)
│ ├─◇ ARRAY3 (src/../stdlib/standard/collection/array3.li)
│ ├─◇ ARRAY (src/../stdlib/standard/collection/array.li)
│ ├─◇ ITERATOR (src/../stdlib/standard/collection/iterator.li)
│ ├─◇ FAST_ARRAY3 (src/../stdlib/standard/collection/fast_array3.li)
│ ├─◇ LINKED_XOR_LIST (src/../stdlib/standard/collection/linked_xor_list.li)
│ ├─◇ LINKED_LIST (src/../stdlib/standard/collection/linked_list.li)
│ ├─◇ HASHED_SET (src/../stdlib/standard/collection/hashed_set.li)
│ ├─◇ FAST_ARRAY2 (src/../stdlib/standard/collection/fast_array2.li)
│ ├─◇ FAST_ARRAY (src/../stdlib/standard/collection/fast_array.li)
│ ├─◇ AVL_DICTIONARY (src/../stdlib/standard/collection/avl_dictionary.li)
│ ├─◇ STD_FILE (src/../stdlib/standard/file_system/std_file.li)
│ ├─◇ DIRECTORY (src/../stdlib/standard/file_system/directory.li)
│ ├─◇ ENTRY (src/../stdlib/standard/file_system/entry.li)
│ ├─◇ FILE (src/../stdlib/standard/file_system/file.li)
│ ├─◇ HTTP_SERVER (src/../stdlib/standard/http/http_server.li)
│ ├─◇ HTTP_HEADER (src/../stdlib/standard/http/http_header.li)
│ ├─◇ FALSE (src/../stdlib/standard/boolean/false.li)
│ ├─◇ BOOLEAN (src/../stdlib/standard/boolean/boolean.li)
│ ├─◇ TRUE (src/../stdlib/standard/boolean/true.li)
│ ├─◇ STRING_CONSTANT (src/../stdlib/standard/string/string_constant.li)
│ ├─◇ STRING (src/../stdlib/standard/string/string.li)
│ ├─◇ ABSTRACT_STRING (src/../stdlib/standard/string/abstract_string.li)
│ ╰─◇ CHARACTER (src/../stdlib/standard/string/character.li)
├─◇ CLUSTER_ITEM (src/cluster_item.li)
├─◇ ITM_STYLE (src/itm_style.li)
├─◇ LYSAAC (src/lysaac.li)
├─◇ ITM_AFFECT (src/itm_affect.li)
├─◇ ANY (src/any.li)
├─◇ PARSER_CLI (src/parser_cli.li)
╰─◇ CLUSTER (src/cluster.li)

Tue 12 Apr 2011, 04:35 PM comp dev en lisaac lysaac

As you may have noticed, I started my own Lisaac compiler, called Lysaac, and I want to make it a little different from Lisaac. I'll try to keep compatibility, but for a few things I might take a different direction.

One of these things is the way prototypes are found.

In Lisaac, you have one complete set of prototypes, and when you look for a prototype, it is searched for everywhere. This is not desirable. Imagine you are writing a library that requires the prototype FOO. Currently, if FOO is not present in the library, instead of issuing an error, the compiler would take the FOO prototype from the application that uses the library, meaning that the library is effectively using a piece of the application code.

I want to take the SmartEiffel approach and separate the source code into a few clusters. A cluster is a collection of prototypes, and the prototypes in a cluster can only use the prototypes of the same cluster or of imported clusters. This solves the above dependency problem.

A cluster is a directory that contains prototypes in .li files and subdirectories. If a subdirectory does not contain .li files, its sub-subdirectories are not searched recursively. A cluster can import another cluster using a cluster file ending with .cli.
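
One possible reading of this directory rule, sketched in Python (the function name is made up, and treating the cluster's root directory as always searched is an assumption):

```python
import os


def cluster_prototypes(root):
    """List the .li files belonging to the cluster rooted at 'root'.

    A subdirectory without any .li file is not descended into.
    """
    found = []

    def visit(directory, is_root):
        entries = sorted(os.listdir(directory))
        li_files = [e for e in entries
                    if e.endswith(".li")
                    and os.path.isfile(os.path.join(directory, e))]
        if not li_files and not is_root:
            return  # prune: nothing below this point is searched
        found.extend(os.path.join(directory, f) for f in li_files)
        for entry in entries:
            sub = os.path.join(directory, entry)
            if os.path.isdir(sub):
                visit(sub, False)

    visit(root, True)
    return found
```

With this rule, a directory that holds only subdirectories acts as a barrier: prototypes stored underneath it are invisible to the enclosing cluster.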

An example of .cli file is as follows:

Section Header

  - name := Cluster LIBFOO;

  - path := ("libfoo-3.14", "../libfoo");

The search paths can be:

  • relative to the .cli file if it starts with .
  • relative to LYSAAC_PATH directories otherwise

LYSAAC_PATH defaults to $XDG_DATA_HOME/lysaac/lib:/usr/local/share/lysaac/lib:/usr/share/lysaac/lib.

The search paths would then be for this example:

  • $XDG_DATA_HOME/lysaac/lib/libfoo-3.14
  • /usr/local/share/lysaac/lib/libfoo-3.14
  • /usr/share/lysaac/lib/libfoo-3.14
  • ../libfoo
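
The resolution of the path entries can be sketched as follows. The function name and the idea of passing the LYSAAC_PATH directories as an already-split list are assumptions made for the example; the location of the .cli file is hypothetical too.

```python
import posixpath


def cluster_search_dirs(cli_file, paths, lysaac_path):
    """Expand the 'path' entries of a .cli file into candidate directories.

    cli_file: location of the .cli file
    paths: entries of the 'path' slot, in order
    lysaac_path: LYSAAC_PATH directories, already split on ':'
    """
    base = posixpath.dirname(cli_file)
    candidates = []
    for entry in paths:
        if entry.startswith("."):
            # Relative to the directory holding the .cli file.
            candidates.append(posixpath.join(base, entry))
        else:
            # Looked up in every LYSAAC_PATH directory, in order.
            candidates.extend(posixpath.join(d, entry) for d in lysaac_path)
    return candidates
```

Called with `("libfoo-3.14", "../libfoo")`, this yields one candidate per LYSAAC_PATH directory followed by the directory relative to the .cli file, in the same order as the list above.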

The parser for these files is being written. With it, you can see the complete hierarchy of the project:

$ lysaac src
◆ Root Cluster
│ Cluster in: src
├─◆ LIB (src/lib.cli)
│ │ Cluster in: lib
│ ├─◆ STDLIB (lib/stdlib.cli)
│ │ │ Cluster in: /home/mildred/.local/share/lysaac/lib/stdlib
│ │ ├─◇ STRING (...)
│ │ ├─◇ ABSTRACT_STRING (...)
│ │ ╰─◇ ...
│ ├─◇ LIBC (lib/libc.li)
│ ╰─◇ CSTRING (lib/cstring.li)
├─◇ PARSER (src/parser.li)
├─◇ CLUSTER_ITEM (src/cluster_item.li)
├─◇ ITM_STYLE (src/itm_style.li)
├─◇ LYSAAC (src/lysaac.li)
├─◇ ITM_AFFECT (src/itm_affect.li)
├─◇ ANY (src/any.li)
├─◇ PARSER_CLI (src/parser_cli.li)
╰─◇ CLUSTER (src/cluster.li)

Now, each item in a cluster can be public or private. Public items are available to the users of the clusters whereas private items are restricted to members of the same cluster. To declare a private item, just say:

Section Header

  + name := Private PROTOTYPE;

or

Section Header

  - name := Private Cluster LIBTOTO;

If you want to declare a whole bunch of prototypes private to your cluster, just include them in a private cluster. To do so, you'll need the following files:

  • cluster/my_private_prototypes.cli:

    Section Header
    
      - name := Private Cluster MY_PRIVATE_PROTOTYPES;
    
      - path := "./deps/my_private_prototypes";
      // The leading "." makes the path relative to the .cli file.
      // The additional deps directory prevents the current cluster from
      // looking into deps/my_private_prototypes. deps should not contain any
      // files, just subdirectories.
    
  • cluster/deps/my_private_prototypes/private_proto.li:

    Section Header
    
      + name := Public Prototype PRIVATE_PROTO;
    

Wed 15 Sep 2010, 02:08 PM comp dev fr lisaac

This post follows up on my previous post about string aliasing. I said I was almost done, but that is probably not the case.

I have serious performance problems.

As a reminder, here is what I am trying to do:

  • Add one or more primitives to access the compiled STRING_CONSTANTs
  • Add string aliasing support to the standard library
  • Replace the compiler's string aliasing (ALIAS_STR) with the aliasing done in the standard library.

The first step is done, using linked lists. The compiler primitives have been added.

The standard library changes are well under way. Since they require a new compiler primitive, the compiler had to be bootstrapped again.

And now comes the last step: removing aliasing from the compiler. To do that, I first build a Lisaac compiler with the new primitive and its support in the standard library. This compiler is extremely slow (2h45 to bootstrap), because aliasing is performed twice. Then I remove aliasing support from the compiler, and I hit a big problem:

  • the compiler in optimized mode crashes
  • the compiler in optimized mode, compiled with gcc in debug mode, does not crash
  • the compiler in debug mode does not crash

I no longer know what to do... and that is where I am for now. I wonder whether -O2 or -fomit-frame-pointer is causing the problem.

As for the optimization problems, I think I have some idea of what is wrong. Here is how I implemented aliasing in STRING_CONSTANT:

Section Public

  - first_string :STRING_CONSTANT <- (first_string := `100`);
  // This is the head pointer to the complete linked list

  + next_string :STRING_CONSTANT := NULL;
  // Pointer from each STRING_CONSTANT to the next one (initialized
  // by the compiler)

Section Private

  //
  // Aliasing String.
  //

  - bucket:SET(ABSTRACT_STRING) <-
  // Set of all strings. HASHED_SET is still far more efficient than
  // an unordered linked list.
  ( + sc :STRING_CONSTANT;
    bucket := HASHED_SET(ABSTRACT_STRING).create;

    sc := first_string;
    {(sc != STRING_CONSTANT) && {sc != NULL}}.while_do {
      bucket.fast_add sc;
      sc := sc.next_string;
    };

    bucket
  );

  - list_insert <-
  // We still keep the linked list up to date; it may come in handy.
  [
    -? { first_string != Self };
  ]
  (
    bucket.fast_add Self;
    next_string  := first_string;
    first_string := Self;
  );

Section Public

  - new_intern p:NATIVE_ARRAY(CHARACTER) count nb_char:INTEGER :SELF<-
  // Do not use directly. WARNING: Use by c_string and c_argument (COMMAND_LINE).
  ( + sc, result:STRING_CONSTANT;

    sc := clone;
    sc.set_storage p count nb_char;
    result ?= bucket.reference_at sc;
    (result = NULL).if {
      result := sc;
      result.list_insert;
    };

    result
  );

In fact, I really messed up!

I have two code slots (<-) that are reinitialized as data (:=). This means that every time the slot is called, the compiler queries an invisible auto-generated slot to find out whether the data or the code is wanted.

On top of that, I have the impression that the version of Lisaac I use to compile (the one that takes 2h45) may have had an old version of the library where aliasing was done with the unsorted linked list instead of HASHED_SET. In short, I have to review everything.
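
For readers unfamiliar with Lisaac, the aliasing (interning) scheme of new_intern can be sketched in Python. This is an illustration only: a dictionary stands in for HASHED_SET, strings stand in for NATIVE_ARRAY(CHARACTER), and the class layout is an assumption.

```python
class StringConstant:
    """Toy model of aliased (interned) string constants."""

    first_string = None  # head of the linked list of all constants
    bucket = {}          # text -> canonical instance (plays HASHED_SET)

    def __init__(self, text):
        self.text = text
        self.next_string = None

    @classmethod
    def new_intern(cls, text):
        existing = cls.bucket.get(text)
        if existing is not None:
            return existing  # same content: reuse the same object
        sc = cls(text)
        cls.bucket[text] = sc
        # Keep the linked list up to date as well (list_insert).
        sc.next_string = cls.first_string
        cls.first_string = sc
        return sc
```

Two calls with equal content return the very same object, so string equality can then be decided by comparing references.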

Mildred
