HTML 2.0: Why I think HTML is broken
First postulate: HTML was designed as a stateless protocol
Context: web sites need to maintain a context (or state) to track the client. This is required by the log-in procedures the various websites have. It is also useful to track the user in a web store, to know which items the user wants to buy. In fact, it is requires almost everywhere.
The first solution to be thrown out for this problem are the cookies. People didn't like cookies but now, everyone accepts them. Nothing works without cookies. Why did people dislike cookies back then? They liked their provacy and cookies makes it possible to track the user. Through advertisement networks, the advertiser known exactly which website the user visited. And it is still the case now. What changed is that the users got tried to fight cookies and have every website break, and they got used to it.
People got used to being tracked just as people are used to be watched by video cameras in the street and people are used to get tracked by the government and big companies and banks.
Cookies are a great way to track prople, all because HTTP didn't include session management. The way Google track you is very simple. Google Analytics puts a cookie on your computer and each time you access the Google Server, they know it's the same person. Google is everywhere:
- Many web sites are using Google APIs, or the jQuery library at Google.
- Many web sites ask Google to track their users to know how many prople visit their page.
- Google makes advertisement.
- Youtube, Blogger, Picasa and others are owned by Google
With this alone, Google is found on almost every page. If you have an account at Google (YouTube, Picase, Gmail, Blogger, Android or other), they can even give a name or an e-mail address to all of these information.
Google motto is Don't be Evil, they are perhaps not evil but can they become evil? Yes.
Whatever, my dream HTTP 2.0 protocol would include of course push support like WebSockets, but more importantly: session management. How should this be done?
HTTP and Session Management
When the server needs a session, it initiates the session by giving a session token to the client. The client needs to protect this token from being stolen and should display that a session is in pogress for this website. It could appear on the URL bar for example. The client could close the session at any moment.
With the token, the server provides its validity scope. Domains, subdomains,
path. Only the resources in the session scope will receive the tocken back. If
http://example.com starts a session at
example.com but have an
<iframe> that includes facebook. Facebook won't receive the session token. If
Facebook wants to start a session (because the user wants to log-in) it will
start a second session.
Session cannot escape the page. If you have two tabs open with facebook in each tab (either full page or embedded), the two facebook instances don't share the same session, unless the user explicitely allowed this. For instance, when Facebook starts a session, the browser could tell the user that Facebook already have an existing session and the user would be free to choose between the new session and the existing one.
How does this solve XSS
XSS is when a website you don't trust access the session of a website you trust, and steal it. At least I think so.
With this kind of session management, the session couldn't possibly be stolen. Suppose that the non-trusted site makes an XmlHttpRequest to gmail.com. If cross-domain wasn't forbidden, any web-site could read your mails.
With the new session management, if the untrusted site makes a request to gmail.com, gmail.com session wouldn't be available and the login page would be returned instead of the list of e-mails. If the non trusted website tries to log-in, you would be prompted to associate the Gmail session with the site you don't trust. If you aren't completely idion, you wouldn't allow the online pharmacy to connect to Gmail.
What is known about you? Let's take an average person that uses her credit card, have and Android phone with Gmail, uses Facebook:
- All your relationships are known by Google (Gmail, Google+) and Facebook
- All your interests are known by Google and Facebook (Ad Sense track which website you visit and Facebook have a huge profile on you)
- All your posessions are known to your bank
- Your photograph is known by Google and Facebook (people probably took a photo of you and placed it on their Android phone synchronized with Google)
- Your location is known (using your Android phone, your credit card, or your RFID card you use for public transportation)
If you ever want to keep private, it is becoming very difficult.