I’ve completed 3 weeks of lectures with the brand new class I was assigned at IIT. It’s a first year, second semester introductory course on Database Systems. The students seem to be a reasonably decent bunch, and this time I’ve got slideshows again, since the course material is developed at Westminster.

Addressing the post title, I just finished the TOEFL iBT test today. I have a tendency to be indirect in my communication (there I go again…) and I had slight time overruns on the spoken section, but overall it was easy enough, even though I did no preparation besides the free sampler. I would however very strongly encourage any test-takers to at least go through that, to get a feel for what the exam’s going to be like – it’s very easy to get lost in the screens in the middle of exam anxiety. My microphone test failed 3 times too, and finally worked only when I held it right over my mouth, so I can only hope the test responses were recorded well enough.

My passport has sadly been held up, and therefore I have so far been unable to register for the IELTS exam. Need to get that over with, as soon as possible too!


SharePoint Portal Server 2003 is my first memory of an HTTP application (that sounds more appropriate than “Web application” 🙂 ) that allowed web content to be dragged and dropped around. Considering that SPS2003 only worked over IE, Google IG is definitely a good step ahead, at least in my book.

Then again, I came across this post on CodeProject and the guy was right – his site leaves you speechless! It just has to be seen, so head over there right now. Be warned though, it ran ok on IE7, but failed on both IE6 and Firefox.

Hunting around for a disposable e-mail address for a site registration, I came across Mailinator – a no-registration, no-login, easy to use service. At any crappy site which requests an e-mail address to ‘verify’ you, simply enter <anything>@mailinator.com, and a mailbox is automatically created, which can hold e-mails for ‘3-4 hours’. Checking it is a snap, with no login, and just the entry of the e-mail address.

Alternatively, each home page visit suggests a brand new, pseudorandom, 14-character mail alias. I thought the idea was really neat… I have come across SpamGourmet before, but for quick and dirty do-it-and-forget-it type of things, I would much prefer Mailinator. Of course, SpamGourmet is more feature-laden and configurable, so it’s no doubt a very nice tool.
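For the curious, here is a minimal Python sketch of generating a throwaway alias of that shape locally. How Mailinator actually picks its suggestions is not something I know; the function name `random_alias` and the lowercase-plus-digits alphabet are my own assumptions.

```python
import secrets
import string


def random_alias(length=14, domain="mailinator.com"):
    """Build a random throwaway address (hypothetical helper, not Mailinator's code)."""
    # Assumed alphabet: lowercase letters and digits only.
    alphabet = string.ascii_lowercase + string.digits
    local_part = "".join(secrets.choice(alphabet) for _ in range(length))
    return f"{local_part}@{domain}"


if __name__ == "__main__":
    print(random_alias())
```

Since the mailbox is created on first use, any alias produced this way should behave the same as one typed by hand.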

The guy who created Mailinator blogs about his creation, and his blog is a very interesting read. About.com mentioned both of these in an article too.

I came across a post about searching Google for MP3 files. Even though it didn’t do a whole lot, it seemed kind of interesting, and I checked out how Firefox’s search plugins are done, and created a new search plugin (and added support for OGG files too 🙂 ).

It was actually very easy to do: all I did was copy the google.xml file from <FIREFOX_INSTALL_DIR>\searchplugins\ and modify the search\param value. It didn’t feel like “enough work” was done, so I extracted the image from the XML file (courtesy of this online Base64 encoder/decoder) and replaced it with my own. Once the file is placed in the searchplugins directory, restarting Firefox loads it up.

Give it a shot. Alternatively, download my file.

Since sometime around June last year, I’ve been working with ANTLR at work. Even though I haven’t really had any formal education on Compiler Theory, this certainly pushed me to the deep end and there was a lot to learn.

Everything went fine till we started to write a parser for PL/SQL sometime in August, and that brings me to my rant:
In PL/SQL, apparently records can have columns named “type”. Variables can be named “ref”. Cursors can be named “count”. And the list goes on. I personally think that
(a). Any language that allows keywords to be used as variables is fundamentally flawed and leads to confusing code
(b). Any programmer who re-uses keywords to create variable names violates basic software engineering principles.

Getting the actual parser running was not too difficult. ANTLR’s Grammar page had a few PL/SQL grammars but none worked with my files. BNF Web was very interesting and I spent a couple of days visiting all of their pages and copying their BNF, but that didn’t work either. Then I actually started going through my workplace files and created a grammar from scratch. That turned out to be highly ambiguous, and I re-did it, left-factoring symbols. This final one worked.

The real headache came when I started to parse the files for testing the grammar and saw all of the above “unconventional” uses. In the end, I gave up trying to fix my parser, and decided to try to make the parser context-sensitive.

Formal definitions aside, my idea of a context-sensitive language is one that, as I pointed out above, recognizes that “ref” is a keyword only when used along with “REF CURSOR” and so on. I still think that non-context-sensitive languages (and I like my languages strongly typed too) are easier to develop on. As can be imagined, I do not like JavaScript and I loathe PL/SQL.

I came across some article that suggested a few approaches, and I either failed to understand them properly, or I simply could not get the results they promised.

I ended up trying at least 5 different approaches on getting the parser to recognize context. Syntactic predicates to override testLiteralsTable() required tight integration between lexer and parser. Overriding testLiteralsTable itself didn’t work, as it required lookahead to work, and this advanced the lexer, overwriting the text to be resolved. Parsing optimistically as a keyword, and rewinding on a parser exception and trying again as an identifier felt promising, but there was no way to re-invoke the entire parser chain, and such a re-invocation could be at any point of the call stack.

Finally, something worked in a limited situation. I wrote an override for match() that sets a flag and clears it immediately before returning. Now, when the grammar expects “BULK COLLECT”, the generated parser simply calls match(LITERAL_collect) immediately after matching BULK. The flag is set, the lexer checks the flag to know it’s a keyword, and the flag is cleared after matching. At any other location in the code, when a COLLECT is met, the flag would be off, and therefore it must be an identifier.
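The flag idea can be modelled in a few lines of Python. This is only a toy sketch, not the ANTLR code I actually wrote: the `Lexer` and `Parser` classes, the `expect_keyword` flag name and the tiny keyword set are all made up for illustration, and the lexer just splits on whitespace.

```python
KEYWORDS = {"bulk", "collect", "into"}  # hypothetical subset, for illustration only


class Lexer:
    def __init__(self, text):
        self.words = text.split()
        self.pos = 0
        self.expect_keyword = False  # raised by the parser only while inside match()

    def next_token(self):
        word = self.words[self.pos]
        self.pos += 1
        # A reserved word counts as a keyword only while the flag is up.
        if self.expect_keyword and word.lower() in KEYWORDS:
            return ("KEYWORD", word.lower())
        return ("IDENT", word)


class Parser:
    def __init__(self, lexer):
        self.lexer = lexer

    def match(self, keyword):
        # Set the flag, pull one token, and always clear the flag before returning.
        self.lexer.expect_keyword = True
        try:
            token = self.lexer.next_token()
        finally:
            self.lexer.expect_keyword = False
        if token != ("KEYWORD", keyword):
            raise SyntaxError(f"expected {keyword!r}, got {token[1]!r}")
        return token

    def identifier(self):
        # Flag is down here, so even a reserved word comes back as an identifier.
        return self.lexer.next_token()


def parse_bulk_collect(text):
    """Parse 'BULK COLLECT INTO <name>' and return the target token."""
    parser = Parser(Lexer(text))
    parser.match("bulk")
    parser.match("collect")
    parser.match("into")
    return parser.identifier()
```

With this model, `parse_bulk_collect("BULK COLLECT INTO collect")` hands back the trailing `collect` as an identifier. Note that the sketch only works because every token is pulled from the lexer at the exact moment the parser asks for it.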

That sounds nice, but didn’t work very well either. The reason is that many keywords are optional and the majority of matches are done after a lookahead. So the problem was resolving between a keyword and an identifier during lookahead and buffer fills.

Overriding the filter (TokenStreamHiddenTokenFilter, to preserve whitespace) to look ahead once again resulted in the lexer advancing. Buffering tokens was also very tricky.

Finally, I took the easy way out. I wrote a second very simple lexer/parser combination that would simply recognize a stream of keywords, identifiers and literals, and not try to recognize a complex structure in it. The token types are then routed to a list and subsequently to an array, and then the real parsing begins. The real parser now can look ahead as well as back to the boundaries of the file, and whenever we determine that a keyword token is actually an identifier, the array is updated at the corresponding location.
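Roughly, the two passes look like the Python sketch below. It is a simplified stand-in rather than the real ANTLR grammar: the regex tokenizer, the keyword set and the two context rules (COLLECT only after BULK, REF only before CURSOR) are assumptions chosen to mirror the examples in this post.

```python
import re

KEYWORDS = {"bulk", "collect", "into", "ref", "cursor"}  # hypothetical subset


def flat_tokenize(text):
    """Pass 1: a flat stream of [kind, text] pairs, with no structure recognized."""
    tokens = []
    for tok in re.findall(r"\d+|\w+|[^\s\w]", text):
        if tok.isdigit():
            kind = "LITERAL"
        elif tok.lower() in KEYWORDS:
            kind = "KEYWORD"
        elif tok[0].isalpha() or tok[0] == "_":
            kind = "IDENT"
        else:
            kind = "SYMBOL"
        tokens.append([kind, tok])  # lists, so pass 2 can update them in place
    return tokens


def resolve_keywords(tokens):
    """Pass 2: demote keywords to identifiers wherever the context does not fit."""
    for i, (kind, text) in enumerate(tokens):
        if kind != "KEYWORD":
            continue
        word = text.lower()
        # The whole array is available, so looking back or ahead is just indexing.
        prev_word = tokens[i - 1][1].lower() if i > 0 else None
        next_word = tokens[i + 1][1].lower() if i + 1 < len(tokens) else None
        if word == "collect" and prev_word != "bulk":
            tokens[i][0] = "IDENT"
        elif word == "ref" and next_word != "cursor":
            tokens[i][0] = "IDENT"
    return tokens


if __name__ == "__main__":
    source = "collect := ref; BULK COLLECT INTO collect; cur REF CURSOR"
    for kind, text in resolve_keywords(flat_tokenize(source)):
        print(kind, text)
```

Because the first pass never tries to understand structure, it cannot get confused by a keyword in an odd place; all the context decisions are deferred to code that can see the whole token array.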

I have just scheduled a complete bulk parse of 5023 PL/SQL files, totalling 180MB in size. The parser has just chewed through 4892 of those (169MB) and that’s after only one day of testing and fixes. It has taken a little over an hour, compared to the roughly 2 1/2 hours it took with my initial “keywords everywhere” parser. The overheads include the repeated invocation of the parser executable, and the double-pass parsing mechanism, but the results are certainly very promising.

My WordPress Blog appears as the 2nd result in a Google search for “rukmal”. Woo hoo!

<sinister_plan>Next target: getting #1</sinister_plan>

Searches for “rukmal fernando” or “rukmal blog” or “rukmal fernando blog” all point back to me on the first few links, so that’s not too bad. The last one incidentally has only a couple of links to me, probably because I haven’t used my full name (i.e.: “Rukmal Fernando”) in my blog.

Ah.. the joys of shameless self-exaltation!

p.s.: Ok, I admit it… while I’m not going overboard to fool PageRank (1|2) those links were placed there in the vain hope that they would bump up my rank just a tad… now that’s not evil, is it? 🙂

My dad was trying to download a very interesting video to demo at his Unit: Qantas CEO Geoff Dixon on CNN’s Boardroom, speaking about, among other things, his lack of a paper ‘degree’.

While hunting around, I came across KeepVid (via CNET), a web based tool that allows the downloading of streaming video. I tried a few YouTube videos and it works very nicely. Neato!

I couldn’t actually get it to download the CNN video, but for that dad found another tool: Replay AV 8. The demo version only allowed capturing 5 minutes or 5 MB, whichever came first (effectively cutting off the bit dad was after – the bit about not having a degree!), but it worked really well. The learning curve was a bit steep, but the product appeared solid, and well worth the $50 price tag.

The CNET article has more info on the topic.

Incidentally, this is my first post using Windows Live Writer. So far it looks pretty cool. Thanks, Janaka, for the heads up!

The 12 Days of Indian Christmas is perhaps the only Indian media product which I thought was _really_ awesome (next to maybe Russell Peters, even though he wasn’t born there).

Just couldn’t help blogging about it.

After much contemplation, I realized that I simply had to have a real home page, and I tried my hand at GeoCities ending up with something like this. Not the best of formatting, but at least it was up.

Afterward, I somehow remembered about my WordPress blog, and came back here. Had to reset the password too, since I couldn’t remember my old one – it’s been that long 😦

When I tried WP’s page creation, I immediately loved it! Page editors were good, I could upload a lot of files (exactly how much, I haven’t figured out yet…), there’s no bandwidth cap like at GeoCities, and I like the clean, neat CSS I’m using now.

So, my home page is now officially over here, and I’m aiming to formally move my blogging here from my older one.

Who am I to go beyond K&R? So here goes:

hello, world!

Exactly as they wrote it in their book.

About this blog

The occasional, seemingly random things that fire in my head
