Thursday, September 06, 2007

A Simple Lexical Analyzer .... Enjoy.

I got bored and wrote a simple lexical analyzer and thought about walking through explaining it but in all honesty its rather self explanatory especially with the inclusion of the diagrams I made and the BNF grammar I am supplying to go with the code snippet.

The grammar simply parses words or phrases in a file with white space as a delimiter and requiring that all words or phrases start with a letter and are followed by any combination of letters and digits.

BNF:


G[<Word>]

<Word> ::= <Letter> | <Letter> <LetterDigit>
<LetterDigit> ::= <Letter> | <Digit> | <LetterDigit> <Letter> | <LetterDigit> <Digit>
<Letter> ::= a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
<Digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

A graphical representation of the grammar:


And the finite state automaton:

Finally, the code ... in python of course :)

Note: If you copy and paste this code it will not work because for one reason or another I can't get blogger to format the tabs correctly, please download the code from the link provided at the bottom.


UPDATE: Now featuring code indentation goodness.

###############################

#!/usr/bin/python
import sys

class SimpleLex:
"""@Author: Adam Miller - Simple lexical anyzer"""

#allowed alphabetic characters in grammar
alpha = 'abcdefghijklmnopqrstuvwxyz'

#allowed digits in grammar
digit = '0123456789'

#white space symbolic representation
wSpace = ' \n \t'

#grammar table of relations
table = [
[1,3,0],
[1,1,2],
[2,2,2],
[3,3,3],
]

#relations of type and their state index
relations = {
"alpha":0,
"digit":1,
"wSpace":2
}


def __init__(self, file_name):
"""open a file, extract contents, close file,
initialize character list, and state variable"""
self.f = open(file_name, 'r')
self.lines = self.f.readlines()
self.f.close()
self.chars = []
self.state = 0

def scan(self):
"""process the file's contents one character at a time"""
for line in self.lines:
for c in line:
# 3 is the error state
if self.state == 3:
print "Error"
self.state = 0
self.chars = []
# 2 is the success state
elif self.state == 2:
print ''.join(self.chars)
self.state = 0
self.chars = []

#let table drive state transitions
self.state = self.table[self.state][self.lex(c)]

"""print final character buffer because the loop will end
execution and not let the last success state be checked"""
print ''.join(self.chars)

def lex(self, c):
"""return the correct relational index to the type of c"""
self.chars.append(c)
if c in self.alpha:
return self.relations["alpha"]
elif c in self.digit:
return self.relations["digit"]
elif c in self.wSpace:
return self.relations["wSpace"]
else:
return 0

#use the code i just wrote
if len(sys.argv) < 2 or len(sys.argv) > 2:
sys.stderr.write("Usage: sampleLex \n")
elif len(sys.argv) == 2:
lexer = SimpleLex(sys.argv[1])
lexer.scan()
###############################

For your viewing pleasure, a small example of the "little lexer" in action:


###############################
adam@pseudogen:~$ ./simpleLex.py lexFile.txt
hello
world
from
my
ub3r
l33t
lexical
analyzer

###############################

The code, the diagrams and the file used in this example are all available here.

Wednesday, September 05, 2007

Computer Science: Through the eye of those on the other side of the looking glass

So I sit there in front of my faithful computer hacking away at what I am pretending to be homework so that I think I am actually being scholastically productive. As I sit, I receive and instant message from a good friend of mine who is a Computer Science Major at Rice university and he has a question about some Mac software's auto backup capabilities, and I answer to the best of my knowledge and the conversation comes to a halt. I simply assume he is troubleshooting something for someone and more likely than not it is for his girlfriend for she is the proud owner of an Intel MacBook, but never the less I go back to my procrastination like a good student.

A few more minutes pass by (could have been over an hour for all I know, procrastination isn't really the best time keeping activity) and I get an instant message from my friend's girlfriend stating that she has been given the task of writing an essay about "What Computer Science is" for her introductory computer science course at Rice (we will assume for all practical purposes that this course is for non-comp sci majors). And she exclaimed to me about the daunting tale that was writing said essay along with the irony that went with the fact that the word processing software she was using had crashed and she lost everything, but being the good student she is she trudged on and rewrote it from scratch. Though, this time, with a different attitude and a delightful spin on it all. Her essay was far too priceless not to publish somewhere and I got to it first! So, without further a due I present to you "What is Computer Science?" by Rachel Gittleman:

" Earlier this evening I wrote a short essay on the definition of computer science, a discipline I am familiar with only through the influence of a computer scientist I have been dating for nearly three years. I did not let him read it because I was embarrassed by how little I actually knew about the subject, and as I was in the process of saving and finally being done with the humiliating exercise, my word processor crashed and took my hard-won essay along with it. Even with the help of said computer scientist boy-friend, the essay proved to be irrecoverable.

So what is computer science? A couple of hours ago my answer was as optimistic and technical as possible for someone who really has no idea what she is talking about. I stressed the dichotomy of the discipline as both a study of computation and computational machines--theory and practical programming, math and engineering. I mentioned the relative newness of the discipline compared to others in academia (I even dropped famous names), and conjectured that views on computer science must be very different now than a mere decades ago, but that at its essence, the field is about what programmable machines can do and how to make them do it. Having my word processor crash was disheartening and, after fruitlessly trying to recover the document, I was convinced to rewrite the essay in its current form.

Computer science is new, rapidly evolving, and incredibly broad. It has the power to drastically improve the quality of our lives over incredibly short spans of time, and in the past few decades it has provided us with earth-shatteringly new tools that quickly have become integral parts of the daily routines of even the most computer- illiterate (myself sadly counted among them). Although these qualities make computer science fascinatingly current and applicable in a way that the older sciences and humanities are not, there are also inherent downsides, particularly when it comes to accountability. The constant demand for new programs and applications and the inability of most of the public (again, including myself) to understand what goes into making them naturally results in buggy programs that can crash without notice, taking hard-working students’ single-spaced essays on computer science with them. I think the public would do much more than sigh and send an error report if a mechanical engineer built something so easily broken.

I still think computer science is the study of computation and computational machines, and I still think that it is an exciting new discipline that I want to know more about (that is why I am taking an introductory course after all), but I also want to make sure to express my hope that one day all that theory and engineering and programming will eventually be accountable for making me a word processor that will not crash."

Wednesday, August 15, 2007

Why Kernels don't matter:

In the vast world of UNIX-like environments we are confronted with users who will make a statement such as "I run Linux" and honestly think that is the all encompassing term for the environment they know and love but this ultimately bothers me for the simple fact that kernels no longer matter and what these users need to understand is the its GNU that needs to receive the credit. Now please, before you spam my inbox with "Power to Tux" emails just hear me out.

Lets take into account the only thing that 95% of users are only concerned with three aspects of a kernel:
1) does it support my hardware?
2) is it secure?
3) is it stable?
And by no means in that order but lets just address them in the order listed.

Does it support my hardware? Well that all depends on which kernel you are looking at but in reality of the current state of mainstream kernel development, it is more likely than not that all of your hardware is fully supported with exception only to wireless chipsets simply because the whimsical world of open source wifi is a cruel and unusual whore. Also, a notable point about hardware is simply that if you just purchased the latest and greatest hardware it is more likely than not that the current development version of the kernel of your choice supports it even if the latest stable release does not and yes this isn't wonderful news, but good things come to those who wait. One last thing to mention about hardware support is that in the event that one open source kernel supports hardware, all the rest (mainstream) do as well.

Is it secure? In short, Yes. Every mainstream and widely accepted system kernel today has a level of security that is acceptable enough to at least a large group of followers. Does this mean "Install, run, secure"? No, of course not. A kernel can only be so secure by default and still be deemed usable. This will raise the question, "but how do I make it secure?" and the answer is ancient and often used to torment new users but simply: "Read the Manual" and this is not meant to be clever, snide, or coy but is sincere in the fact that contributors spend hours documenting software so please do not allow their efforts to go in vain.

Is it stable? That is honestly a question that I can not answer for you because the definition of the word 'stable' in the software universe appears to be translated very loosely. Is it "Microsoft stable?" Yes, and then some. I will be willing to bet that any *n?x style kernel you use will be more stable than anything The Blue Empire will wrap up and try to sell you. Is it "debian GNU stable?" That is more likely not to be the case, but at the same time many people find the debian definition of "stable" to be far too strict (I find it to be perfect, but its all a matter of opinion).

Now that all three main concerns have been addressed where to go from here? Well, how about clearing up how kernel's don't matter? OK.

When you boot into your "Linux" installation you are actually booting into "GNU/Linux" which is in fact just a port of the GNU userspace to the Linux kernel. "Is there any other port?" Yes, many. The GNU/OpenSolaris port, the debian GNU/kfreebsd port, the debian GNU/Hurd port, and the debian GNU/NetBSD port are all prime examples. These are quite possibly not all the examples, but my knowledge of the debian GNU world is more intimate than my knowledge of other GNU communities/project. If the average user was to sit down to any of these kernel ports of the GNU userspace they would more likely than not know the difference simply because "the kernel does not matter."

Now, allow me to retort against myself and attempt to offer food for thought. Kernels do matter, it matters very much so if your kernel is capable of many things including POSIX compliance for semaphore capabilities, pthreads, and other important features that are necessary for porting applications. Your kernel also matters in the sense of resource management, if it has a sufficiently fast network stack (and more importantly is that something you are worried about), if it has an efficient process scheduler, if it can multitask without excessive overhead, if it can allocate memory in a timely manner and continue to manage memory efficiently, etc... There are many aspects of a kernel that must be taken into account in selecting one that is "for you", but again .... these things do not matter to the average user and thus "Kernel's don't matter."

This has been yet another random babbling brought to you in part by a very bored 'Me' ... hope you enjoyed.

Wednesday, April 11, 2007

debian etch stable release

debian GNU/Linux has officially released their version 4.0r0 code named "etch" ... I am a few days late on posting, but basically the results of my installing it are this:

debian came, debian dominated, enough said.

..... my home desktop will once again be running debian for the foreseeable future.

Monday, April 09, 2007

Windows 2000 "mock source code"

/* Source Code Windows 2000 */

#include "win31.h"
#include "win95.h"
#include "win98.h"
#include "workst~1.h"
#include "evenmore.h"
#include "oldstuff.h"
#include "billrulz.h"
#include "monopoly.h"
#include "backdoor.h"
#define INSTALL = HARD

char make_prog_look_big(16000000);
void main()
{
while(!CRASHED)
{
display_copyright_message();
display_bill_rules_message();
do_nothing_loop();

if (first_time_installation)
{
make_100_megabyte_swapfile();
do_nothing_loop();
totally_screw_up_HPFS_file_system();
search_and_destroy_the_rest_of-OS2();
make_futile_attempt_to_damage_Linux();
disable_Netscape();
disable_RealPlayer();
disable_Lotus_Products();
hang_system();
} //if
write_something(anything);
display_copyright_message();
do_nothing_loop();
do_some_stuff();

if (still_not_crashed)
{
display_copyright_message();
do_nothing_loop();
basically_run_windows_31();
do_nothing_loop();
} // if
} //while

if (detect_cache())
disable_cache();

if (fast_cpu())
{
set_wait_states(lots);
set_mouse(speed,very_slow);
set_mouse(action,jumpy);
set_mouse(reaction,sometimes);
} //if

/* printf("Welcome to Windows 3.1"); */
/* printf("Welcome to Windows 3.11"); */
/* printf("Welcome to Windows 95"); */
/* printf("Welcome to Windows NT 3.0"); */
/* printf("Welcome to Windows 98"); */
/* printf("Welcome to Windows NT 4.0"); */
printf("Welcome to Windows 2000");

if (system_ok())
crash(to_dos_prompt)
else
system_memory = open("a:\swp0001.swp",O_CREATE);

while(something)
{
sleep(5);
get_user_input();
sleep(5);
act_on_user_input();
sleep(5);
} // while
create_general_protection_fault();

} // main

/* I saw this posted on the ubuntuforums and thought it was too funny not to post here as well.... enjoy :) */

Thursday, March 15, 2007

The case of the missing switch statement

For all my C/C++ and Java programmers out there who have some to know and love the reserved word switch would find it rather interesting to venture in the direction of a language that lacks such a statement, just as I had. I have recently divulged into the realm of what I will call "open source at its finest" by learning the Python programming language. Python is a very verbose, flexible, powerful, and object oriented interpreted programming language (details, if you want/need them, here) that I have grown to love over the past month in my adventures of learning its ways, but I recently stumbled across something that I originally thought to be an oddity: the lack of a "switch" or a "case" statement. I later discussed it with a friend of mine who simply said,"why do you need a switch statement?" and I really couldn't find a solid answer. He later went on to explain how a switch statement is simply a way to make a complex/nested if statement faster because when it compiles there is just a static jmp (for those of us sadly stuck on a x86 machine) statement in the assembly to where it needs to be instead of multiple compare operations. Thus, since python is interpreted, it would never reap the benefits of this optimization.

Though, if you are simply stuck on switch statements you can take the following Java code:

int x;
//read in a value in some form or fashion and assign it to x
switch (x) {
case 1: this.someFunction(); break;
case 2: this.someOtherFunction(); break;
}

To the following python code (Note: the python code is utilizing the power of the "dictionary" data type that is part of the language)

self.x = #read in a value in some form or fashion to assign to x
mySwitch = {
1: self.someFunction,
2: self.someOtherFunction
}
callFunct = mySwitch.get(x)
callFunct();

#Neither of these code examples have been compiled or run respectively, just coded off the top of my head ... so if there is a slight syntax error, sorry :)

So... it can "be done" but what advantage does that code have over this python code?:

if x == 1
self.someFunction()
elif x==2
self.someOtherFunction()

Well, that honestly depends on who you ask and in what respect you are asking it but for all practical purposes there really isn't an advantage or disadvantage either way its just mainly stylistic preference.

Why did I write this?
Well, I am bored ... it is spring break and this was one of the best ways I could think of to kill time and procrastinate from actually writing the lexical analyzer and parser for my compiler theory class. And yes, I am writing a compiler in python ... why? because I think it will be funny.

/me


Friday, March 09, 2007

The linux merit badge

When I started on linux, it was the black magic of the computing world. Novell hadn't bought SuSE, HP wasn't writing drivers, Dell hadn't honored its existence, and pretty much the only company actually doing anything with it was RedHat. Back in the days where Gnome 1.x and KDE 1.x reigned supreme, blackdown was the only way to get Java functional without heavy hacking, the 2.6 kernel had just recently gone stable and so many users were scared to upgrade, x86_64 wasn't even a publicly released concept on the hardware end of the spectrum much less the software world, PowerPC was still going strong over at the Apple Camp, WindowsXP was much anticipated by the Microsoft huggers, and when you said "I run linux" people automatically assumed you knew what you were talking about, and at the time there was a 95% chance that you really did.

Fast forward to today: Vista is released, Novell owns SuSE, RedHat offers Certifications, Canonical rushed in and took over the linux desktop market with throwing millions of USD at their Ubuntu distribution, HP writes native linux drivers for their printers and random peripherals, Dell is now offering linux on PCs, specialty companies are all over the place offering linux centric services and hardware, and any noob with a computer that is set to boot from cd-rom can run and, in most cases, install linux on their computer. Is this a good thing? Well, yes and no.

Yes:
More users means more support, which is a very good thing.

No:
I am annoyed with people who won't read documentation to learn things on their own, this up and coming generation of linux users wants to be spoon fed everything. When I started I was taught how to use man pages and learned that google was my best friend.

I recently started as a TA for one of my professors teaching an Abstract Data Types and programming algorithms class in Java at my University and there are two, only two, linux users in the class of roughly 25. That doesn't bother me so much, linux isn't for everyone and it still has an "under dog" aura about it, but when I started speaking to these linux using students (who run Ubuntu) I quickly realized they know nothing about linux. They don't realize that Gnome != "a version of linux", they were lost as soon as I opened a terminal window, and weren't familiar with even the trivial task of checking their screen resolution. I asked a few questions regarding the Operating System and simply got the reply,"I dunno, when I installed it just worked" and I thought to myself "That is horrible" but as the day went on I truly thought it through and realized that this is a mile stone for desktop linux. Yes, they are under informed but ask your grandmother what the difference between explorer.exe and iexplorer.exe on a Windows machine and you will receive a blank stare. What has happened is that the linux merit batch, as I like to call it, no longer certifies your knowledge of linux but simply your endorsement of the open source movement by being open minded enough to use something different and see what the "under dog" has to offer. Your level of involvement and further reputation you have earned along the way will prove your skill level.

Conclusion:
The linux merit badge has lost a little power behind its punch but at the cost of betterment for the movement as a whole. The GNU/Linux world stands to gain a lot from the fact that your "average Joe" can use it as a desktop system without flaws. Will linux ever take over the desktop market? I don't know, nobody can really know, but I think we are at least stepping in the right direction of strengthening our user base in numbers and as time goes on and curiosity is sparked I think the next generation of linux users will educate themselves out of desire, not necessity. So basically, the "Yes" outweighs the "No" and I think the "No" will work itself out with time.

/me

Saturday, February 10, 2007

Ubuntu: crash, burn, or burst into flames....

Its been a while since I have posted, and to anyone who reads this I am sorry. The semester is full swing and I am taking a compiler theory class that appears to be rather time consuming, but enough about me....

As everyone knows, Ubuntu has teamed up with Linspire as described here and Linspire's CNR will be ported to Ubuntu's upcoming Feisty Fawn release.

What is my take on this?
Well, we can look at it from multiple points of view. At first glance I don't like the idea basically because of my old debian habits and ideals but if I take a moment to truly think about exactly what this will do for new linux users and general desktop users who don't want to bother with the command line, or the annoying complexity of synaptic, I can't seem to find myself to completely turn my back on the integration. Also, in the event that Ubuntu is given any say so in the development cycle of CNR I could see them taking it to a productive place that it has never reached. Ubuntu obviously has some brilliant developers on their team, staff or volunteer alike, and I would like to see where they could take the whole CNR idea.

I know, short and sweet ... but I had a sour taste in my mouth about it in the beginning but after some mental exercise about the topic I think it will turn out to be a positive addition to the Ubuntu world so there isn't a whole lot more to say.

/me

Wednesday, January 10, 2007

Fun quote

Windows XP - The 64-bit wannabe with a 32-bit graphics interface for 16-bit extensions to a 8-bit patch on a 4-bit operating system designed to run on a 2-bit processor by a company that can't stand 1-bit of competition