Wednesday, March 23, 2011
How Google Analytics Works
Google is forthcoming with details of how Analytics works, but it doesn't present them in a nerd-friendly way. The data is wrapped up inside videos instead of dumped as text. I may have been able to learn this by watching the videos, but it was more fun to figure it out by playing with it.
Summary of Google Analytics' Design
- All user data is stored in cookies on the visited site.
- No cookies are set on any Google Analytics Domain.
- Each user has a unique ID on each domain they visit.
- User data is transmitted back to Google Analytics in the query string when requesting a 1x1 GIF.
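The last point can be sketched in a few lines of Python. This is only my illustration of the idea, not Google's code: the endpoint path and the parameter names (utmac, utmhn, utmp, utmcc, utmn) are what I believe the tracker uses, and the account ID, host name, and cookie value are made up.

```python
import random
from urllib.parse import urlencode

# Assumed endpoint for the tracking pixel; the summary only calls it "utm.gif".
GIF_ENDPOINT = "http://www.google-analytics.com/__utm.gif"


def build_beacon_url(account, hostname, page, cookies):
    """Pack first-party cookie data into the query string of a GIF request."""
    params = {
        "utmac": account,   # assumed name: the Analytics account ID
        "utmhn": hostname,  # assumed name: host of the visited site
        "utmp": page,       # assumed name: path of the page being viewed
        # assumed name: the site's own utm* cookies, serialized as one string
        "utmcc": ";".join(f"{k}={v}" for k, v in sorted(cookies.items())),
        "utmn": random.randint(0, 2**31 - 1),  # random number to defeat caching
    }
    return GIF_ENDPOINT + "?" + urlencode(params)


# Hypothetical values, for illustration only.
url = build_beacon_url(
    account="UA-XXXXX-X",
    hostname="example.com",
    page="/index.html",
    cookies={"__utma": "1.2.3.4.5.6"},
)
print(url)
```

Since the cookies live on the visited site's domain, the browser never sends them to Google on its own. Copying them into the query string is what carries the data across.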
UML Sequence Diagram of Google Analytics' Design
Analyzing the pieces of Google Analytics Data
- Definitions of each of the utm* cookies
- Descriptions of the pieces of data in query string for utm.gif
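To poke at those pieces yourself, a request captured from the browser can be split apart with the Python standard library. This is a rough sketch; the sample URL and its values are hypothetical, and the field names in it are my assumption about what a real request contains.

```python
from urllib.parse import parse_qs, urlsplit


def beacon_fields(url):
    """Return the query-string fields of a tracking-GIF request as a dict."""
    query = urlsplit(url).query
    # parse_qs maps each key to a list of values; keep the first of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical request URL, for illustration only.
sample = (
    "http://www.google-analytics.com/__utm.gif"
    "?utmhn=example.com&utmp=%2Findex.html"
    "&utmcc=__utma%3D1.2.3.4.5.6"
)

fields = beacon_fields(sample)
for name in sorted(fields):
    print(f"{name:6} {fields[name]}")
```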
Monday, March 21, 2011
Weekend Hacking Failure: .NET parser for YAML in ANTLR
Useless information: I tried to write a decent .NET parser for YAML over the weekend. I failed, but learned a few things:
- I didn't have a good handle on what YAML was. For example, I didn't realize that whitespace was significant.
- YAML (as of version 1.2) is a full superset of JSON.
- YAML defines all kinds of stuff to support strongly typed serialization that I didn't want and didn't need.
- The grammar presented in the YAML standard isn't particularly well suited for copying. It ends up with all sorts of useless rules that the lexer or parser doesn't like, and you have to eliminate them manually.
- One of the main YAML guys apparently tried to build a parser for YAML on top of ANTLR (which is what I was trying to use) a few years back. It looks like he gave up fairly early on.
- I probably should've started by copying SnakeYaml, which is a pretty good Java implementation.
Tuesday, March 08, 2011
What I learned about git today: reverting, ignoring, diffing, and vimming
I learned a few things about git today:
- I wanted to see what I'd added to the index, but not committed. There may well be a better way to do it, but here are the two commands I discovered and aliased:
    di  = diff-index --color=auto --cached --patch --relative HEAD      # shows the actual differences
    dfi = diff-index --color=auto --cached --name-only --relative HEAD  # shows just the files which are changed
- I wanted to remove changes from the index. The Git Book has an excellent page on all manner of undoing. Specifically, I needed the "git reset" command to revert staged but uncommitted changes.
- Git kept listing files whose mode had changed in the list of diffs. While those are technically diffs, I didn't care. I found the solution on Stack Overflow: "git config core.filemode false" changes the config within a repo to ignore file mode changes.
- I've long enjoyed seeing side-by-side diffs of pending CVS changes in vim. I've been using the CVSMenu plugin for as long as I can remember. Of course, it doesn't work with git, but there's an even cooler plugin for vim these days, VCS Command, which gives you similar functionality for many version control systems, including git.
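To convince myself how the first three items fit together, here is a small Python sketch that replays them in a throwaway repository. It assumes git is on the PATH. The helper names, the file name notes.txt, and the demo identity are all made up for illustration, and the sketch returns None if git isn't installed.

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run one git command inside repo and return its standard output."""
    cmd = ["git", "-c", "user.name=demo", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false", *args]
    result = subprocess.run(cmd, cwd=repo, check=True,
                            capture_output=True, text=True)
    return result.stdout


def staged_files(repo):
    """List files that are staged in the index but not yet committed."""
    out = git(repo, "diff-index", "--cached", "--name-only", "HEAD")
    return out.split()


def demo():
    """Stage a change, list it, then unstage it with git reset."""
    if shutil.which("git") is None:
        return None  # git is not installed; nothing to demonstrate
    with tempfile.TemporaryDirectory() as repo:
        path = os.path.join(repo, "notes.txt")
        git(repo, "init", "-q")
        git(repo, "config", "core.filemode", "false")  # ignore mode-only changes
        with open(path, "w") as f:
            f.write("first\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-q", "-m", "initial")
        with open(path, "a") as f:
            f.write("second\n")
        git(repo, "add", "notes.txt")   # the change is now in the index
        before = staged_files(repo)
        git(repo, "reset", "-q")        # unstage it; the working copy is untouched
        after = staged_files(repo)
        return before, after


print(demo())
```

The first list is what the dfi alias would show after staging, and the second is what it shows once "git reset" has emptied the index again.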