Mildred's Website

GoogleTalk, Jabber, XMPP address:
mildred@jabber.fr


GPG Public Key
(Fingerprint 197C A7E6 645B 4299 6D37 684B 6F9D A8D6 9A7D 2E2B)

Category: comp

Articles from 17 to 26

Tue 28 Feb 2012, 11:07 AM by Mildred Ki'Lya comp en web

I just moved the site to webgen. For the occasion, I forked the project on GitHub and improved a lot of things. As webgen doesn't seem to be alive, I think I'll be the one maintaining it in the future.

I took the occasion to move my website over to my own web server, hosted at home. If I ever find it unsuitable, I can very easily host my website elsewhere as it is just a bunch of static pages.

More to come ...

Wed 19 Oct 2011, 10:21 AM by Mildred Ki'Lya comp dev en privacy web

What is this all about: web privacy. We are tracked everywhere, and I'd like to help if possible. So, let's design a web browser that, for once, respects your privacy.

Main features:

  • Each website has its own cookie jar, its own cache and its own HTML5 local storage
  • History related css attributes are disabled
  • External plugins are only enabled on demand
  • Support for Tor/I2P is enabled by default
  • You have complete control over who receives what information
  • Lets you control in the settings whether referrers are allowed.
  • The contextual menu lets you open links anonymously (no referrer, anonymous session)

The browser is bundled with a particular UI that lets you control everything during your browsing session. It is non-intrusive and makes the best choice by default. I am thinking of a notification bar shown at the bottom. I noticed that this place is not intrusive when I realized that the search bar in Firefox was open most of the time, even though I mostly wasn't using it.

First, let's define a session:

  • A session can be used by more than one domain at the same time.
  • A session is associated with a specific cache storage
  • A session is associated with a specific HTML5 storage
  • A session is associated with a specific cookie jar
  • A session can be closed. When it is closed, session cookies are deleted from the session
  • A session can be reopened. All long-lasting cookies, cache, HTML5 storage and such are then used again.
  • A session can be anonymous. In such a case, the session is deleted completely when it is closed.
  • A session is associated with zero, one or more domains. These domains are the domains the end user can see in the address bar, not the sub-items in the page.

Sessions are like Firefox profiles. Opening a new session is like opening a Firefox profile you just created. Sessions exist because people will never create a different Firefox profile for each site.
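To make the definition above a bit more concrete, here is a minimal sketch of a session as a data structure. It is written in Python purely for illustration; the names Cookie, Session, close and reopen are hypothetical, and a real browser would back the cache and storage with files rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Cookie:
    name: str
    value: str
    persistent: bool = False  # False means a "session cookie"


@dataclass
class Session:
    """One isolated browsing context (hypothetical model)."""
    domains: set = field(default_factory=set)    # top-level domains only
    cookies: list = field(default_factory=list)  # the session's own cookie jar
    cache: dict = field(default_factory=dict)    # the session's own cache
    storage: dict = field(default_factory=dict)  # the session's own HTML5 storage
    anonymous: bool = False
    is_open: bool = True

    def close(self):
        """Close the session; return False if nothing is left to reopen."""
        self.is_open = False
        if self.anonymous:
            # Anonymous sessions are deleted completely.
            self.domains.clear()
            self.cookies.clear()
            self.cache.clear()
            self.storage.clear()
            return False
        # Otherwise only the session cookies are dropped.
        self.cookies = [c for c in self.cookies if c.persistent]
        return True

    def reopen(self):
        """Use the long-lasting cookies, cache and storage again."""
        if self.anonymous:
            raise ValueError("an anonymous session cannot be reopened")
        self.is_open = True
```

Closing a normal session keeps everything except its session cookies, while closing an anonymous one leaves nothing behind.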

If we want to protect privacy, a new session should be created each time a link is opened. To keep web sites usable, sessions can be shared in specific cases. Let's define the cases where it makes sense to share a session:

  • You click a link or submit a form and expect to still be logged in to the site you are viewing. You don't care if you follow a link to an external page.

    User Interface: If the link matches one of the domains of the session, then keep the session. No UI. If the user wanted a new session, the "Open anonymously" entry in the context menu exists. A button on the toolbar might be available to enter a state where we always want to open links anonymously.

    If the link points to another domain, then open the link in a new session unless "Open in the same session" was specified in the context menu. The UI contains:

    We protected your privacy by separating <domain of the new site> from
    the site you were visiting previously (<domain of the previous site>).
    
    Choices: [ (1) Create a new anonymous session          | ▼ ]
             | (2) Continue session from <previous domain> |
             | (3) Use a previous session for <new domain> |
             | (4) Use session from bookmark "<name>"      |
    

    The first choice is considered the default and the page is loaded with it. If the user chooses a new option, then the page is reloaded.

    If the user chooses (2), the page is reloaded with the previous session and the user is asked "Do you want <previous domain> and <new domain> to have access to the same private information about you?". Answers are Yes, No and Always. If the answer is Always, then the two domains are considered one and the same in the configuration.

    The choice (3) will use the most recent session for the new domain. It might be a session currently in use or a session in the history.

    There are as many (4) options as there are bookmarks for the new domain. If different bookmarks share a single session, only one bookmark is shown. This choice will load the session from the bookmark.

    If (3) and (4) are the same sessions, and there is only one bookmark (4), then the (4) option is left out.

  • You use a bookmark and expect to continue the session you had for this bookmark (webmails)

    The session is simply stored in the bookmark. When saving a bookmark, there is an option to store the session with it or not.

    [X] Do not save any personal information with this bookmark
    
  • You open a new URL and you might want to reuse a session that was opened for this URL.

    The User Interface allows you to restore the session:

    We protected your privacy by not sending any personal information to
    <domain>. If you want <domain> to receive private information, please
    select:
    
    Choices: [ Do not send private information     | ▼ ]
             | Use a previous session for <domain> |
             | Use session from bookmark "<name>"  |
    

If you can see other use cases, please comment on that.
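The link-following rules above can be sketched as two small functions. This is only my reading of the rules, in Python for illustration; the function names, the string results and the way sessions are identified (plain ids) are all assumptions.

```python
def default_session_for_link(current_domains, link_domain,
                             open_anonymously=False, same_session=False):
    """Return "keep" or "new-anonymous" for a followed link."""
    if open_anonymously:
        return "new-anonymous"   # "Open anonymously" in the context menu
    if same_session or link_domain in current_domains:
        return "keep"            # same domain, or "Open in the same session"
    return "new-anonymous"       # another domain: separate by default


def offered_choices(previous_domain, new_domain, recent=None, bookmarks=None):
    """List the notification-bar choices for a cross-domain link.

    recent: id of the most recent session for new_domain, or None.
    bookmarks: mapping of bookmark name -> session id for new_domain.
    """
    bookmarks = bookmarks or {}
    choices = ["Create a new anonymous session",
               f"Continue session from {previous_domain}"]
    if recent is not None:
        choices.append(f"Use a previous session for {new_domain}")
    # Bookmarks sharing a single session are shown only once.
    by_session = {}
    for name, session_id in bookmarks.items():
        by_session.setdefault(session_id, name)
    # A single bookmark pointing at the same session as (3) is left out.
    if len(by_session) == 1 and recent in by_session:
        by_session = {}
    for name in by_session.values():
        choices.append(f'Use session from bookmark "{name}"')
    return choices
```

The first entry returned by offered_choices is always the default the page is loaded with; picking any other entry would trigger a reload.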

From these use cases, I can infer three kinds of sessions:

  • Live sessions, currently in use
  • Saved sessions, associated to a bookmark
  • Closed sessions in the past, accessible using the history. They are garbage-collected once they get too old.

Now, how to implement that? I was thinking of QtWebKit as I already worked with Qt and it's easy to work with.

  • We have a main widget: QWebView. We want to change the session when a new page is loaded. So we hook up with the signal loadStarted.
  • We prevent history related CSS rules by implementing QWebHistoryInterface, more specifically, we store the history necessary to implement QWebHistoryInterface in the session.
  • We change the cache by implementing QAbstractNetworkCache and setting it using view->page()->networkAccessManager()->setCache(...)
  • We change the cookie jar by implementing QNetworkCookieJar and setting it using view->page()->networkAccessManager()->setCookieJar(...)
  • Change the local storage path using a directory dedicated for the session and using view->page()->settings()->setLocalStoragePath(QString)

After all that, we'll have to inspect the resulting browser to determine if there are still areas where we fail at protecting privacy.

Wed 28 Sep 2011, 10:42 AM by Mildred comp en privacy

Here is my Bug 660340. I created it after looking at the recent Facebook 'enhancements' that make privacy even more precious (article in French).

We need to quickly find a way to preserve our privacy on the Internet.

Hi,

With the recent threats to our privacy from the various big internet companies, it would be a great enhancement if Epiphany allowed separate navigation contexts (cookies, HTML5 storage, cache) at will, and easily.

Some companies, especially Facebook (and I suppose Google could do that as well), can use all kinds of methods to track a user uniquely, using cache, HTML5 storage or cookies. I wonder if they can use the cache as well, but I heard it could be prevented. Firefox does.

One solution to counter these privacy threats is to have a different browser, or a different browser profile, for each of the web sites we load. This is however very inconvenient, and it should be made easy.

First, let's define the concept of a session. A session is almost like a separate instance of the browser. It shares bookmarks and preferences with other sessions, but has a separate cache, a separate set of cookies and separate HTML5 DOM storage.

I imagine the following behaviour, based on the document.domain of the toplevel document:

  • If a page is loaded without referrer, and the domain is not associated with an existing session, start a new session for that domain

  • If a page is loaded without referrer, and the domain is a domain that is already associated with an existing session, then prompt non intrusively:

    "You already opened a session on example.com."
    Choices: [  Start a new session    ▼ ] [Use this session]
             [[ New anonymous session   ]]
             [  Replace existing session ]
    

    If the session is started anonymously, it would not be considered for reuse

  • If a page is loaded using a link from an existing window/tab, and the domain is the same, then share the session

  • If a page is loaded using a link from an existing window/tab, and the domain is NOT the same, then a non-intrusive message is displayed:

    "You are now visiting example2.com. Do you want to continue your session
    from example1.com?"
    Choices: [  Start a new session    ▼ ] [Share previous session]
    

    The "Start a new session" dropdown menu depends on whether example2.com is already associated with a session. If example2.com is associated with a session:

     [[ New anonymous session            ]]
     [  Replace example2.com session      ]
     [  Use existing example2.com session ]
    

    If example2.com is not associated with a session:

     [[ New anonymous session    ]]
     [  New example2.com session  ]
    

The choice in [[xxx]] (as opposed to [xxx]) is the most privacy-enhancing one, and would be the default if the user chooses in the preferences:

[x] allow me not to be tracked

The messages are non intrusive, they can be displayed as a banner on top of the page. The page is first loaded with the default choice, and if the user decides to use the other choice, the page will be reloaded accordingly (or the session will be reassigned).

This setting could additionally be used to set the Do Not Track header.

Every prompt should have a "do not prompt me again" setting that could be reset at some point.

About embedded content: because toplevel pages do not share the same session (a toplevel page opened at example.com has a different session than a toplevel page opened at blowmyprivacy.org), if a page from example.com has embedded content from blowmyprivacy.org, the embedded content would not be able to track the user, except within the example.com website.
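The effect on embedded content can be illustrated with a small sketch where cookie jars are keyed by the domain of the toplevel page. This is Python for illustration only, not Epiphany code; the class PartitionedJars and its methods are hypothetical names.

```python
class PartitionedJars:
    """Cookie jars keyed by the toplevel page's domain (hypothetical)."""

    def __init__(self):
        self._jars = {}

    def jar_for(self, toplevel_domain):
        # Each toplevel domain gets its own, initially empty, jar.
        return self._jars.setdefault(toplevel_domain, {})

    def set_cookie(self, toplevel_domain, origin, name, value):
        # The cookie of the embedded origin lives in the toplevel page's jar.
        self.jar_for(toplevel_domain)[(origin, name)] = value

    def get_cookie(self, toplevel_domain, origin, name):
        return self.jar_for(toplevel_domain).get((origin, name))
```

A tracker embedded in example.com sees its own cookie again on other example.com pages, but finds an empty jar when it is embedded in a page from another toplevel domain.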

It is possible to imagine global settings that would hide some complexity:

[ ] When I load a page, always associate it with its existing session.
[ ] When I switch website, always reuse the existing session of the new
    website.

This user interface might seem complex at first, but it is far less complex than letting the user deal with different browser profiles by hand. Unfortunately, I don't think we can abstract privacy that easily. Keep in mind that all of these settings would be enabled only if the user chooses to enable privacy settings.

I am ready to contribute to the implementation of this highly important feature (important for our future and our privacy). Do you think an extension might be able to do all of that, or do you think the browser code should be modified?

Further possible enhancements:

  • Add another choice for anonymous sessions using Tor (or I2P)

  • Add the possibility to have multiple sessions registered with the same domain. This would enable the user to have different profiles for the same website.

Fri 05 Aug 2011, 10:35 AM by Shanti comp en html web

First postulate: HTTP was designed as a stateless protocol

Context: web sites need to maintain a context (or state) to track the client. This is required by the log-in procedures the various websites have. It is also useful to track the user in a web store, to know which items the user wants to buy. In fact, it is required almost everywhere.

The first solution thrown at this problem was cookies. People didn't like cookies, but now everyone accepts them. Nothing works without cookies. Why did people dislike cookies back then? They liked their privacy, and cookies make it possible to track the user. Through advertisement networks, the advertiser knows exactly which websites the user visited. And it is still the case now. What changed is that users got tired of fighting cookies and having every website break, and they got used to them.

People got used to being tracked just as people are used to being watched by video cameras in the street and to being tracked by the government, big companies and banks.

Cookies are a great way to track people, all because HTTP didn't include session management. The way Google tracks you is very simple. Google Analytics puts a cookie on your computer and each time you access the Google server, they know it's the same person. Google is everywhere:

  • Many web sites are using Google APIs, or the jQuery library at Google.
  • Many web sites ask Google to track their users to know how many people visit their pages.
  • Google runs advertising.
  • YouTube, Blogger, Picasa and others are owned by Google

With this alone, Google is found on almost every page. If you have an account at Google (YouTube, Picasa, Gmail, Blogger, Android or other), they can even attach a name or an e-mail address to all of this information.

Google's motto is Don't be Evil. They are perhaps not evil, but can they become evil? Yes.

Whatever, my dream HTTP 2.0 protocol would include of course push support like WebSockets, but more importantly: session management. How should this be done?

HTTP and Session Management

When the server needs a session, it initiates the session by giving a session token to the client. The client needs to protect this token from being stolen and should display that a session is in progress for this website. It could appear in the URL bar, for example. The client could close the session at any moment.

With the token, the server provides its validity scope: domains, subdomains, path. Only the resources in the session scope will receive the token back. If, for example, http://example.com starts a session at example.com but has an <iframe> that includes Facebook, Facebook won't receive the session token. If Facebook wants to start a session (because the user wants to log in), it will start a second session.

A session cannot escape the page. If you have two tabs open with Facebook in each tab (either full page or embedded), the two Facebook instances don't share the same session, unless the user explicitly allowed this. For instance, when Facebook starts a session, the browser could tell the user that Facebook already has an existing session and the user would be free to choose between the new session and the existing one.
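As a rough sketch of the scope check a client could perform before sending a token back, here is some Python. Nothing here is part of any real protocol: the SessionToken fields, the page_id used to tie a session to one page, and the function token_applies are all assumptions made for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SessionToken:
    value: str
    domain: str                      # e.g. "example.com"
    include_subdomains: bool = False
    path: str = "/"
    page_id: int = 0                 # the page (tab) that owns the session


def token_applies(token, url, page_id):
    """True if the token should be sent back with a request for url."""
    if page_id != token.page_id:
        return False                 # a session cannot escape its page
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host != token.domain:
        # Only accept real subdomains, not hosts that merely end the same.
        if not (token.include_subdomains and host.endswith("." + token.domain)):
            return False
    path = parts.path or "/"
    prefix = token.path.rstrip("/")
    # The path scope matches whole path segments only.
    return path == prefix or path.startswith(prefix + "/")
```

With a token scoped to example.com, a request made by an embedded Facebook frame, or by the same site in another tab, would go out without the token.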

How does this solve XSS

XSS is when a website you don't trust accesses the session of a website you trust, and steals it. At least I think so.

With this kind of session management, the session couldn't possibly be stolen. Suppose that the untrusted site makes an XMLHttpRequest to gmail.com. If cross-domain requests weren't forbidden, any website could read your mails.

With the new session management, if the untrusted site makes a request to gmail.com, the gmail.com session wouldn't be available and the login page would be returned instead of the list of e-mails. If the untrusted website tries to log in, you would be prompted to associate the Gmail session with the site you don't trust. If you aren't a complete idiot, you wouldn't allow the online pharmacy to connect to Gmail.

Extra

What is known about you? Let's take an average person who uses her credit card, has an Android phone with Gmail, and uses Facebook:

  • All your relationships are known by Google (Gmail, Google+) and Facebook
  • All your interests are known by Google and Facebook (AdSense tracks which websites you visit and Facebook has a huge profile on you)
  • All your possessions are known to your bank
  • Your photograph is known by Google and Facebook (people probably took a photo of you and placed it on their Android phone synchronized with Google)
  • Your location is known (using your Android phone, your credit card, or the RFID card you use for public transportation)
  • ...

If you ever want to stay private, it is becoming very difficult.

Tue 28 Jun 2011, 12:51 PM comp en lisaac lysaac misc

Lysaac is my reimplementation of the lisaac compiler. Until now, it wasn't very interesting to look at, but recently, I pushed a few interesting commits:

  • you now have variables
    • the default value is initialized correctly
    • the read works
    • the write works
  • you also have BLOCKs (sorry, no upvalues for now)

This may look like nothing, but under the hood, the infrastructure is almost completely there.

Next thing to come: inheritance and error reporting.

Then perhaps syntax improvements like keyword messages and, later, operators. For now, I want the basic functionality working well.

If you want to play with it, you can. If you get an error, create a use-case and propose it as a new feature. Please use as a model the .feature files.

Tue 28 Jun 2011, 12:51 PM comp fr lisaac lysaac misc

Lysaac is my reimplementation of the lisaac compiler. Until now there wasn't much to see, but lately there have been some interesting commits:

  • variables work
    • with default values
    • they can be read
    • and written to
  • there are also BLOCKs, but without upvalues

It may not look like much, but the compiler infrastructure is in fact almost complete.

Next steps: inheritance and error reporting

And maybe after that: syntax improvements (calls to slots with parameters and, much later, operators). For the moment, I am concentrating on the basics.

If you want to play with it, you can. If you get an unexpected error, create a usage scenario and send it to me (preferably as a .feature file).

Tue 28 Jun 2011, 12:17 PM comp configuration en misc

I always wanted to manage the files in my home directory. Generally, it's a complete mess and I wanted to get things right and understand the files I had.

At first, I just created a simple shell script that maintained symbolic links from the dotfiles and dotdirs of my homedir to .local/config, a sort of early XDG configuration directory. I also changed my .bashrc and later .zshrc to point to files in .local/etc/profile.d. The shell script read a database in .local/config/database.sh that contained the link information.

The script did the following for each file declared:

  • if the file existed in the homedir but not in the database directory, it was simply moved and a link was created in its place.
  • if the file existed in both places, it reported a conflict.
  • if the file existed in the database directory but not in the homedir, it created the link.
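These three rules can be sketched in a few lines. This is a Python illustration only, not the actual shell or Tcl code of the script; the function name link, its arguments and its return strings are hypothetical, and the handling of an already existing link or of a file missing on both sides is my own assumption.

```python
import os
import shutil


def link(home, store, home_name, store_name):
    """Apply the three rules to one declared file; return what was done."""
    src = os.path.join(home, home_name)    # e.g. ~/.bashrc
    dst = os.path.join(store, store_name)  # e.g. ~/.local/config/bashrc
    if os.path.islink(src):
        return "already linked"            # assumption: leave existing links alone
    in_home = os.path.lexists(src)
    in_store = os.path.lexists(dst)
    if in_home and in_store:
        return "conflict"                  # rule 2: present in both places
    if in_home:
        shutil.move(src, dst)              # rule 1: move it, then link it
    elif not in_store:
        return "missing"                   # assumption: nothing to link at all
    os.symlink(dst, src)                   # rules 1 and 3: create the link
    return "linked"
```

Running it a second time on the same entry changes nothing, so the whole database can be replayed safely.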

The database looks like:

link ".bashrc" "bashrc"
link ".zshrc"  "zshrc"

# First file in home directory
# Second file in .local/config

My script just defined a function link and sourced the database. But links were not easy to construct in shell. So later, I decided to rewrite it in Tcl, simply because the syntax is compatible (I love the Tcl syntax for that) and because of the wonderful file command.

Later, I improved the script that was then called fixdir to list which files were not managed, and display them. So I could either delete those files (because I don't care about them) or integrate them in the database. The script gained a clean command to automatically remove files declared as noisy.

Now, I have a slightly different problem. I now have several computers that do not all share the same configuration. At first, I synchronized everything and just used the hostname in the database to get different links depending on the machine. But now, with my computer at work, I will not synchronize all my personal configuration. I have to move to a modular approach.

And this script did not help me in tracking what programs I installed in ~/.local/{bin,share,lib}. For this, I wanted something like stow. I tried using stow, but it failed with a conflict. Then I tried using homedir. I just didn't like it because it created an ugly ~/bin instead of ~/.local/bin.

Then I realized my fixdir script looks much like homedir already, and I patched it up to make it better. And there it is.

The current version of fixdir is on github.

Let me copy the README file:

What is it?

This is my homedir package manager, written first in shell and then translated to Tcl. Originally, this was just to maintain a set of symbolic links from my home directory to a directory where all important config files were stored. Then I decided to make it a package manager.

How to install it?

git clone git://github.com/mildred/fixdir.git fixdir
fixdir_dir="$(pwd)/fixdir"
cd
"$fixdir_dir/fixdir" install "$fixdir_dir/hpkg.tcl"

fixdir is installed in ~/.local/bin. Make sure it is in your $PATH.

How does it work

fixdir works better when you are in your target directory (homedir)

Invoke one action with a database file. The database file is a Tcl script that contains all the files and directories that should be linked.

What else can it do?

fixdir unknown lists all files not managed by fixdir in the current directory

fixdir clean removes files declared as noisy

Bugs

  • fixdir list doesn't work when pwd != target directory

Mon 09 May 2011, 02:40 PM comp dev en

FAQ

  • What is HMP: It is a messaging protocol intended to replace e-mail.

  • Why replace e-mail: Because it is full of spam and unmaintainable. This alternative is lighter and easier to implement than a full SMTP server with spam management.

  • Why use HTTP: I'm not a fan of putting everything over HTTP but it has its advantages:

    • It has a security layer (HTTPS)
    • It is (relatively) simple and implemented everywhere
    • It manages content-types and different types of requests
    • It is extensible
    • It goes easily through proxies and NATs
    • It allows multiplexing many different resources on the same server

    In the long run, perhaps we should move away from HTTP as:

    • It is too associated with the web
    • It doesn't allow initiative from the server.

    WebSockets could be a good alternative one day.

  • How do I get my messages: Not specified, although you could possibly authenticate using a standard HTTP method to the same resource as your address and issue a GET command.

  • Does this allow a web implementation: Yes. It will need to be further specified, but if the server detects a browser request (without the HMP headers) on the resource, it could serve a web page with a form.

  • Is the message format specified: no, it needs to be. I plan on using JSON.

  • Do you plan an implementation: Yes, using probably node.js or Lua.

  • What prompted this: The Tor network doesn't have any standard messaging system. I don't believe SMTP is suited for that.

  • Why write this spec, you have no code to back this up: because I like writing specs, and it's a way to remind myself to write the code, and to tell myself how I should write it. I might not get the time to write this as soon as I want.

What is an HMP address

Scheme:

[hmp:]server[:port][/path]

Example:

hmp:gmail.com:80/user
gmail.com:80/user
gmail.com/user

domain.org/u/alicia

Translation to HTTPS resources

An HMP address can directly be translated to an HTTPS resource. The standard scheme translates to:

https://server:port/path
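The translation is mechanical enough to sketch in a few lines of Python. The function name hmp_to_https is hypothetical, and since the default port is not specified here, the sketch simply leaves the port out of the URL when the address has none. It does not try to handle IPv6 literals.

```python
def hmp_to_https(address):
    """Translate an HMP address such as "hmp:users.net/~bob" to an HTTPS URL."""
    if address.startswith("hmp:"):
        address = address[len("hmp:"):]          # the scheme prefix is optional
    authority, slash, path = address.partition("/")
    server, colon, port = authority.partition(":")
    if not server or (colon and not port.isdigit()):
        raise ValueError(f"not a valid HMP address: {address!r}")
    url = "https://" + server
    if port:
        url += ":" + port
    return url + slash + path
```

For instance, hmp_to_https("hmp:gmail.com:80/user") gives "https://gmail.com:80/user" and hmp_to_https("domain.org/u/alicia") gives "https://domain.org/u/alicia".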

Message sending overview

To send a message from domain.org/alicia to users.net/~bob, the sequence is:

  • Connection to users.net:

    [1] POST https://users.net/~bob
    [1] HMP-Pingback: 235
    [1] HMP-From: domain.org/alicia
    [1] Content: message-content
    
  • users.net opens a connection to domain.org

    [2] GET https://domain.org/alicia
    [2] HMP-Pingback: 235
    [2] HMP-Method: MD5
    
  • domain.org responds to users.net

    [2] HMP-Hash: ef0167eca19bb2d4c8dfe4c3803cc204
    [2] Status: 200
    
  • users.net responds to the original sender

    [1] Status: 200
    

Headers to the POST request

The POST request is the request used to post a message. It contains two specific headers:

  • HMP-From: The address the message is sent from

  • HMP-Pingback: A sequence number that identifies the message for the sender. It need not be globally unique, as long as at any one point in time only one message corresponds to this ID.

Particular status codes:

  • 200 in case of success

  • 403 in case the From address could not be authenticated

From address authentication, pingback

In order to avoid SPAM, the sender must be authenticated when the message is sent. For this reason, before accepting or rejecting the request, the server must initiate a pingback procedure to the sender.

First, the From address is converted to an HTTPS resource and a GET connection is initiated. The specific request-headers are:

  • HMP-Pingback: the pingback sequence number from the previous request

  • HMP-Method: method for verifying the originating message. The only specified method is "MD5"

MD5 Method

In case the message is recognized, the from server responds with the following header:

  • HMP-Hash: MD5 hash of the content of the message identified by the pingback identifier

The status code can be:

  • 200 in case the message was recognized

  • 404 in case the message was not found

If the MD5 sum corresponds to the message received and a success code was given, the from is verified and the message can be sent.
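The whole pingback exchange can be simulated in memory to check that the logic holds together. This is a Python sketch with no networking at all: the class Sender, the function receive and the choice to answer 404 for an unknown method are assumptions of mine, not part of the spec.

```python
import hashlib


class Sender:
    """The "from" server: remembers outgoing messages by pingback number."""

    def __init__(self):
        self.outbox = {}

    def send(self, pingback, content):
        # Keep the content so that a later pingback can be answered.
        self.outbox[pingback] = content

    def handle_pingback(self, pingback, method):
        """Answer the GET request with (status, headers)."""
        if method != "MD5" or pingback not in self.outbox:
            return 404, {}   # assumption: unknown methods are also "not found"
        digest = hashlib.md5(self.outbox[pingback]).hexdigest()
        return 200, {"HMP-Hash": digest}


def receive(sender, pingback, content):
    """The receiving server: 200 if the sender is verified, else 403."""
    status, headers = sender.handle_pingback(pingback, "MD5")
    expected = hashlib.md5(content).hexdigest()
    if status == 200 and headers.get("HMP-Hash") == expected:
        return 200
    return 403
```

A message whose content does not match what the claimed sender stored under that pingback number, or whose pingback number the sender does not know, is rejected with 403.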

Mon 02 May 2011, 12:04 PM comp dev en lisaac lysaac

This is great. Here are the source files:

c/cstring.li

Section Header

  + name := Reference CSTRING;

  - role := String; // const char*
  - type := Integer 8;

c/main.li

Section Header

  + name := MAIN;

Section Public

  - puts str:CSTRING <- External `puts`;

  - main <-
  (
    puts "Hello World";
  );

You type lysaac compile c >c.bc and you get the following LLVM assembly code:

c.bc

@0 = private constant [12 x i8] c"Hello World\00"


declare void @puts (i8*)

define void @main () {
  %1 = getelementptr [12 x i8]* @0, i32 0, i32 0
  tail call void @puts(i8* %1)
  ret void
}

And you can execute it using the standard LLVM tools:

$ llvm-as < c.bc | lli
Hello World
$

Isn't that great?

Fri 29 Apr 2011, 09:56 AM comp dev lisaac lysaac

Look at this:

object.li

Section Header

  + name := Singleton OBJECT;

Section Public

  - is_null :BOOLEAN <- FALSE;

null.li

Section Header

  + name := Singleton NULL;

Section Inherit

  - parent :OBJECT := OBJECT;

Section Public

  - is_null :BOOLEAN <- TRUE;

union.li

Section Header

  + name := UNION;

Section Inherit

  - parent :OBJECT := OBJECT;

union.1.li

Section Header

  + name := Expanded UNION(E);

  - import := E;

Section Inherit

  - parent :UNION := UNION;

Section Public

  + element:E;
  - set_element e:E <- (element := e;);

  - when o:T do blc:{o:T;} <-
  (
    (o = E).if {
      blc.value element;
    };
  );

  - from_e e:E :SELF <-
  ( + res :SELF;
    res := clone;
    res.set_element e;
    res
  );

union.2.li

Section Header

  + name := Expanded UNION(E, F...);

Section Inherit

  + parent_e :UNION(E);
  + parent_next :UNION(F...);

Section Public

  - when o:T do blc:{o:T;} <-
  (
    (o = E).if {
      parent_e.when o do blc;
    } else {
      parent_next.when o do blc;
    };
  );

use.li

Section Header

  + name := USE;

Section Public

  - accept_object_or_null obj:UNION(USE,NULL) <-
  (
    obj
    .when NULL do { o:NULL;
      ? { o.is_null };
    }
    .when USE do { o:USE;
      ? { o.is_null.not };
    };
  );