PHP and regular expressions (Read 62214 times)
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #30 - Jun 17th, 2008 at 10:30pm
 
Rad wrote on Jun 17th, 2008 at 3:50pm:
Q2: Why would anybody want to translate a high level language into a lower-level (intermediate) language? Higher is usually better than lower, no?

For humans, yes. Computers, though, have to ultimately do everything at the level of their CPU instructions.

Getting from point A (high level, human-readable) to point B (CPU instructions encoded as numbers) requires a lot of work.

Of course, that work is mechanical, and so we can get computers to do that, too. So we have programs to translate programs.

...but what do we write those in?

As the old joke goes, if the world is a disc sitting on the back of an elephant, what is the elephant standing on?
The answer is: another elephant. In fact, it's elephants all the way down.

Working with PHP, you're working with a particular kind of program called an interpreter. Part of that first translates your PHP program into an internal form (compiling it to an internal program) and then another part of it actually does the work of running the program (interpreting that compiled program).
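
As a rough illustration of that two-stage split (and emphatically not how the real PHP engine is built), here is a toy sketch. The function names compile_expr() and run_program() and the instruction format are invented for this example; it only handles non-negative integers with + and *.

```php
<?php
// Toy "compiler": turn "2 + 3 * 4" into instructions for a stack machine.
// Hypothetical helper for illustration only; no error handling.
function compile_expr(string $source): array
{
    preg_match_all('/\d+|[+*]/', $source, $m);
    $tokens = $m[0];
    $pos = 0;
    $code = [];

    // term := number ('*' number)*
    $term = function () use ($tokens, &$pos, &$code) {
        $code[] = ['push', (int) $tokens[$pos++]];
        while ($pos < count($tokens) && $tokens[$pos] === '*') {
            $pos++;
            $code[] = ['push', (int) $tokens[$pos++]];
            $code[] = ['mul'];
        }
    };

    // expr := term ('+' term)*
    $term();
    while ($pos < count($tokens) && $tokens[$pos] === '+') {
        $pos++;
        $term();
        $code[] = ['add'];
    }
    return $code;
}

// Toy "interpreter": run the compiled instructions one by one.
function run_program(array $code): int
{
    $stack = [];
    foreach ($code as $instr) {
        switch ($instr[0]) {
            case 'push':
                $stack[] = $instr[1];
                break;
            case 'add':
                $b = array_pop($stack);
                $a = array_pop($stack);
                $stack[] = $a + $b;
                break;
            case 'mul':
                $b = array_pop($stack);
                $a = array_pop($stack);
                $stack[] = $a * $b;
                break;
        }
    }
    return array_pop($stack);
}

$program = compile_expr('2 + 3 * 4');
echo run_program($program), "\n"; // prints 14
```

The human-readable string only exists in the first stage; the second stage never sees it, only the internal list of push/add/mul steps.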

Writing PHP, you're moving down a level to the world of strings and numbers.

But there's another level inside that, which is the PHP interpreter. And another world inside that. And often, there's another world inside that.

[ In fact the CPU instructions aren't what the hardware really runs, either. The CPU itself has a little thing that re-encodes those too, into an even lower-level form designed for the particular arrangement of gates and switches and registers inside it. In the old days that was an interpreter inside the CPU called microcode; these days something like it is still there but isn't as easy to describe. ]

From go to whoa, there can actually be an enormous number of these onion-like layers present.

The thing is, interpreters actually aren't all that different to other programs. When we create very general programs, they in turn often become programmable things - steps in a bigger process.

...your old Guide to Norton Ghost is, in many respects, just a high-level program, designed to be run with human beings and other programs as its component pieces instead of strings and numbers...

And while you're probably thinking in terms of the PHP programs you write being aimed at the level of humans (which they certainly can be), in fact lots of Web development is actually aimed at web transport of data that are used by other programs (the most common kind of those being JavaScript programs that run in the browser, but plenty of that happens between web servers too).

The distinction between programs that work with human beings and programs that work with other programs and then finally the programs that work on other programs to rewrite them from high-level to low-level forms can ultimately become quite subtle.
 
 

Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #31 - Jun 17th, 2008 at 11:20pm
 
Rad wrote on Jun 17th, 2008 at 3:50pm:
Q3: "THE lambda calculus"?? .. and not just (regular, ol') "lambda calculus"?

It gets to be called "THE" because it's so incredibly important; as I mentioned earlier in the thread (and as the book talks about) even though it was created in the early 20th century as a piece of mathematical logic, it captures so perfectly what computing is that it has taken on an incredibly special status in theory.

Functional programming languages are especially beloved in academic research because, being structured as equations, they let you prove theorems about such programs in general. The lambda calculus has become the vehicle in which most of that theoretical work has historically been done, and so by defining as direct as possible a transformation into it, new functional languages can easily pick up that theory and have it applied to them as well.

And of course, it's practical too. The Lisp and Scheme programming languages are almost direct vehicles for lambda-calculus expressions, and those being so venerable (Lisp, c. 1958, being almost as old as electronic computers themselves) have been well studied in terms of how to run them efficiently, so translating to them is a practical shortcut used in "bootstrapping" a brand new functional language in an elegant way.
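
To give a feel for how little the lambda calculus needs, here is a small sketch using PHP closures (it assumes PHP 7.4 or later for the arrow-function syntax). It encodes numbers as nothing but functions, the classic "Church numerals"; the variable names and the $to_int helper are just choices made for this example.

```php
<?php
// Church numerals: the number n is represented as "apply $f n times to $x".
$zero = fn($f) => fn($x) => $x;
$succ = fn($n) => fn($f) => fn($x) => $f($n($f)($x));
$add  = fn($m) => fn($n) => fn($f) => fn($x) => $m($f)($n($f)($x));

// Helper (not part of the calculus): convert back to a PHP integer for display.
$to_int = fn($n) => $n(fn($k) => $k + 1)(0);

$two   = $succ($succ($zero));
$three = $succ($two);
echo $to_int($add($two)($three)), "\n"; // prints 5
```

Everything above the helper is built purely from one-argument functions, which is all the calculus itself has to offer.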

[ The equivalent status for "bootstrapping" imperative languages is to use translation to C, but there the purpose is almost purely just the practical one of C being ubiquitous, and its ubiquity derives from C having a clear route for translation into CPU instructions. ]
 
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #32 - Jun 17th, 2008 at 11:36pm
 
Rad wrote on Jun 17th, 2008 at 3:50pm:
In learning PHP, I've come across references to 'C' .. and a book that keeps coming up is the "K&R" book: http://www.amazon.com/C-Programming-Language-2nd-Ed/dp/0131103709/ Have you heard of this?

Indeed I have, and in fact I've had the pleasure of meeting Dennis Ritchie in the flesh a couple of decades ago (during one of his visits to New Zealand he dropped in to Auckland University).

C as a language is in many ways spectacularly unremarkable. However, in terms of being in the right place at the right time, it's the all-time top dog. The story of how it came to be is part of that - a spare PDP-7 turning up at Bell Labs, and Ken Thompson and Dennis Ritchie being at a loose end and writing a little hobby operating system to occupy themselves.

So Thompson started to write an operating system for it. Mostly in assembly language, which was a pain, so he also started on a compiler for a higher-level language.

Thompson had earlier looked at a quite simple programming language called BCPL designed in Cambridge - BCPL was very low-level, built on a deliberately simple model of computer memory - and as part of exploring it had written a cut-down version of it (called B). When the group moved on to the brand new PDP-11, Ritchie extended B into a more complete language to program it (following B, called C).

And each time they improved their little OS, more and more of it became written in C, until eventually nearly all of it was, and eventually the C compiler itself ended up written in itself too.

[ Brian Kernighan's part came a little later: he is the "K" in "K&R", having co-written the book with Ritchie. ]

And the source code for that little OS - called UNIX - was published, along with the C compiler itself, and got sent to some universities who also had PDP-11s. And the effect was quite remarkable.

When the microprocessor really started to take hold, the design of those early microprocessors (the 6502 and 6800 and such like) was basically taken from the PDP-11; 8-bit bytes and 16-bit addresses and the ASCII character set were all things associated with the DEC series, and so as the microprocessor started to really take off, C (even more so than UNIX) was a great fit for it.
 
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #33 - Jun 17th, 2008 at 11:56pm
 
Rad wrote on Jun 17th, 2008 at 4:10pm:
are the terms 'functions' and 'statements' interchangeable?

That's actually quite a deep one to answer.

The short answer is that in imperative languages, they mostly are interchangeable, because functions are mostly just a different kind of statement that happens to compute a value (a value being a number or a string, usually). In imperative languages, the central concept is the statement - the commands you issue - and while functions exist, they exist in the context of, and themselves are built out of, statements.

Oddly enough, they are mostly interchangeable in functional languages too, because in those everything is a function, and if you want to do something that doesn't need to return a value, by convention there's a special "no value" kind of value called "nil" or "null".

Sometimes books use the term "statement" more narrowly and apply it more to the special built-in constructs like "IF" and "WHILE" and such, because although almost everything is a statement, these things which provide the "control structures" for the language are the most special kinds of statement.
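
A small PHP sketch of the distinction, for what it's worth. The function names double_it() and say_hello() are made up for the example.

```php
<?php
// A function call is an expression: it produces a value you can use in place.
function double_it(int $n): int
{
    return $n * 2;
}
$total = double_it(21);

// The same call can also stand alone as a statement; its value is just discarded.
double_it(21);

// A function with no "return" still yields a value when called: null.
function say_hello()
{
    echo "hello\n";
}
$nothing = say_hello();
var_dump($nothing); // NULL

// "if" is purely a statement: it has no value, so it cannot be assigned.
// $x = if ($total > 10) { ... };   // this would be a syntax error in PHP
```

So a function call can be used wherever a statement can, but a control structure such as "if" cannot be used where a value is expected, which is why books sometimes reserve the word "statement" for those.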

 
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #34 - Jun 18th, 2008 at 12:35am
 
HIGH RADIATION ALERT!

Ok, so having popped over to Stevey's blog to get some linkage for this thread I saw a new article had been posted and settled into a fun read....

cue the Twilight Zone music please...

Stevey writes:
Quote:
when I was in the Navy Nuclear Power School program in Orlando, Florida

OK, now I'm creeped out. That's just too scary a coincidence.
 
 
Rad
Radministrator
Sufferin' succotash
Posts: 4090
Newport Beach, California
Re: PHP and regular expressions
Reply #35 - Jun 18th, 2008 at 8:03am
 
That *is* a coincidence.

Quote:
So we all think we're smart for different reasons. Mine was memorization. Smart, eh? In reality I was just a giant, uncomprehending parrot. I got my first big nasty surprise when I was in the Navy Nuclear Power School program in Orlando, Florida, and I was setting historical records for the highest scores on their exams. The courses and exams had been carefully designed over some 30 years to maximize and then test "literal retention" of the material. They gave you all the material in outline form, and made you write it in your notebook, and your test answers were graded on edit-distance from the original notes. (I'm not making this up or exaggerating in the slightest.) They had set up the ultimate parrot game, and I happily accepted. I memorized the entire notebooks word-for-word, and aced their tests.

i recall regurgitating 'keywords and tricky phrases'. orlando is where i went, too.
 

Rad
Radministrator
Re: PHP and regular expressions
Reply #36 - Jun 19th, 2008 at 6:55pm
 
Can't help but wonder what class he was in. I was in 7709 .. (began Sept 1977).

I never recall hearing anyone being compared to previous classes. For all I know that data was not available.

Regarding this script (it *is* called a 'script,' right?):

Code:
printf("%d bottles of tonic water cost $%f", 100, 43.20); 


I'm kinda wondering WHY the $%f is NOT a variable (since variables begin with $...).
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #37 - Jun 19th, 2008 at 7:36pm
 
Rad wrote on Jun 19th, 2008 at 6:55pm:
I'm kinda wondering WHY the $%f is NOT a variable (since variables begin with $...).

Another good question.

This is due to a subtle thing that is present in computer and natural languages, where the meaning of a thing - such as in this case the '$' - isn't determined just by itself, but also by its surrounding context. Sometimes that context is close by, sometimes it's far away. Sometimes it's on the left of the thing you're looking at, sometimes on the right, sometimes both.

[ Being very technical for a moment, this context sensitivity is important; it's degrees of context sensitivity that underpin the hierarchy of mathematical power that divides what regular expressions can do from the more powerful grammars. ]

So for instance, the meaning of a $ within a double-quote isn't necessarily the same as the meaning outside one. The rules that apply within the double-quote about the meaning of such strings are not the same as - indeed, conflict to a small degree with - the ones for the meaning of $ outside.

In particular, the PHP system has to decide for every character in the double-quoted string whether it means "itself", which is the primary meaning of things in strings, or "escapes out" of that context to draw in something from another part of the language, such as to substitute a variable value (which PHP calls interpolation).

The other thing is that although there is a way to say explicitly in double-quoted strings that the '$' is just a '$' by using '\$', PHP as a scripting-type language tends to a philosophy that rather than generating an error, it tries to do its best.

So, when encountering the $ in the left context of a double-quoted string, the PHP interpreter doesn't immediately decide what it is, but starts drawing in the right context of what follows the string to see whether it can make sense of that as the name of a variable to substitute in.

If it can't make sense of it - because the simple syntax doesn't match, which in your example it doesn't because the next character after the $ is %, which can't begin a valid variable name - then it has a choice.

Although PHP could legitimately choose to raise an error because the expression makes no sense, the implementation could instead choose to simply guess that you meant to use \$ but forgot.
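
A few lines of PHP that show the cases side by side; this is just a sketch, with $price being a variable invented for the example (and %.2f used instead of %f so the amount prints with two decimal places).

```php
<?php
$price = 43.20;

// "$" followed by a valid variable name is interpolated.
echo "Cost: $price\n";                     // Cost: 43.2

// "$" followed by "%" cannot start a variable name, so it stays a literal "$".
printf("%d bottles of tonic water cost $%.2f\n", 100, $price);
// 100 bottles of tonic water cost $43.20

// "\$" says explicitly that a literal "$" is meant.
echo "The variable is called \$price\n";   // The variable is called $price

// Single-quoted strings never interpolate variables at all.
echo 'Cost: $price', "\n";                 // Cost: $price
```

So being inside double quotes is only part of the context; what follows the $ on its right decides the rest.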

Languages, particularly imperative languages, are full of these kinds of policy choices, because they are not designed with any reference to mathematical principles but grown by use. And just as with human languages, the process of growth by use leads to ambiguity and puns, and the language interpreter having to apply what could be called disambiguation heuristics to try and guess a meaning for things that are not actually well-defined.

These things are concessions to usability, by trying to let humans get on with "do what I mean you silly machine" instead of the compiler being seen as picky, but they complicate the implementation of the language itself immensely. They also tend to be the kinds of things that are the most susceptible to breaking horribly as the language evolves, because a later extension to the language may ascribe a meaning to something formally undefined before, or a rewrite of the code may accidentally not preserve a bug which disambiguated things one way and not another.
 
 
Rad
Radministrator
Re: PHP and regular expressions
Reply #38 - Jun 19th, 2008 at 7:58pm
 
while i read your last, another question: i keep seeing references to 'decimal' integer..

.. oh, nevermind. i see decimal is simply base-10 integer. i follow. i would've chosen another term than decimal .. which conjures up ideas of floats (decimal points mean you no longer have an integer).
 
Rad
Radministrator
Re: PHP and regular expressions
Reply #39 - Jun 19th, 2008 at 8:02pm
 
Quote:
degrees of context sensitivity that underpin the hierarchy of mathematical power that divides what regular expressions can do from the more powerful grammars

I thot regular expressions *were* powerful .. re the title of this thread.

Okay, now that you got me curious, what are "the more powerful grammars"??
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #40 - Jun 19th, 2008 at 8:10pm
 
Rad wrote on Jun 19th, 2008 at 6:55pm:
Regarding this script (it *is* called a 'script,' right?):

PHP programs generally are, yes, just as with UNIX shell scripts, Perl scripts and JavaScript.

That said, though, the distinction between what is a script and what isn't is really pretty fuzzy, and there is no formal definition.

Things that are "load-and-go", where the language system works by sucking in the source code and then immediately doing what the source code says, are the things that are most usually called "scripts". If the source code is pretreated by a separate toolchain first, so that the original source code isn't what's used at the time things go, that's usually not called a script.

Really though, that distinction isn't all that firm. There are systems which don't work as above, and other properties (such as whether the language possesses a static type system or a dynamic type system) that tend to split down a similar line but which don't capture the essence either.

At the end of the day, whether something is a script or not is as much about how we think about the program as it is an intrinsic property of the language (or the implementation of the language). Things that work in a "script-y" way tend to have a lot of features in common (load-and-go, dynamic type system) as a result, but that's a consequence, not a cause, of what it is that people tend to do with them.

 
 

Rad
Radministrator
Re: PHP and regular expressions
Reply #41 - Jun 19th, 2008 at 8:11pm
 
Quote:
So for instance, the meaning of a $ within a double-quote isn't necessarily the same as the meaning outside one. The rules that apply within the double-quote about the meaning of such strings are not the same as - indeed, conflict to a small degree with - the ones for the meaning of $ outside.

Note that examples *have* been given where $variables are located INSIDE the dbl-quotes. (in a print statement). So .. it's not defined obviously solely by being inside/outside "".
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #42 - Jun 19th, 2008 at 8:20pm
 
Rad wrote on Jun 19th, 2008 at 7:58pm:
i would've chosen another term than decimal

The more you work in non-decimal number systems (binary, octal, hexadecimal) the more you get used to explicitly calling out the cases of decimal numbering.
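
For instance, PHP lets you write the same integer in several bases, which is where calling out "decimal" starts to earn its keep. A quick sketch (the variable names are only for illustration; the 0b form needs PHP 5.4 or later):

```php
<?php
// Four ways of writing the same integer in PHP source code.
$decimal = 255;
$hex     = 0xFF;        // hexadecimal (base 16)
$octal   = 0377;        // octal (base 8): note the leading zero
$binary  = 0b11111111;  // binary (base 2)

var_dump($decimal === $hex && $hex === $octal && $octal === $binary); // bool(true)

// In printf, %d asks for a decimal integer; %x, %o and %b ask for other bases.
printf("%d %x %o %b\n", 255, 255, 255, 255); // 255 ff 377 11111111
```

All four are plain integers with no fractional part; "decimal" here only says which base the digits are written in.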

[ This relates, of course, to cultural conventions of currency and weights and measures too. "Decimal currency" is a reasonably common term in the British Commonwealth because the conversion from the non-decimal Imperial system to a base-10 one is still mostly within living memory. ]
 
 
Nigel Bree
Ex Member
Re: PHP and regular expressions
Reply #43 - Jun 19th, 2008 at 8:33pm
 
Rad wrote on Jun 19th, 2008 at 8:02pm:
I thot regular expressions *were* powerful .. re the title of this thread.

Okay, now that you got me curious, what are "the more powerful grammars"??

Well, as I mentioned earlier, it turns out that in most attempts to describe languages through grammars, the grammars themselves fit into the class of things which are equivalent to the lambda calculus (aka "Turing-equivalent"), and which therefore are fully capable of acting as computer languages themselves.

Regular expressions, as formally described, don't fall into that class. In between regular expressions and Turing-equivalent are some other finely-graded categories of things; mapping out these is of interest to Automata Theory, and context is a crucial thing that distinguishes the classes.
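
One standard example of the gap is matching nested brackets. The sketch below is only an illustration: the function names are invented, and it sticks to a pattern that is regular in the formal sense (practical regex engines, including the one behind PHP's preg functions, add extensions that go beyond that).

```php
<?php
// A regular expression in the formal sense has no memory for "how many so far",
// so the best it can do here is check the shape: some "(" followed by some ")".
function looks_like_parens(string $s): bool
{
    return preg_match('/^\(*\)*$/', $s) === 1;
}

// Checking that the brackets actually balance needs a counter (or a stack),
// which is the extra power the next class of grammars up provides.
function is_balanced(string $s): bool
{
    $depth = 0;
    foreach (str_split($s) as $ch) {
        if ($ch === '(') {
            $depth++;
        } elseif ($ch === ')') {
            if (--$depth < 0) {
                return false; // a ")" with nothing open to close
            }
        }
    }
    return $depth === 0;
}

var_dump(looks_like_parens('((('));  // bool(true), although nothing is closed
var_dump(is_balanced('((('));        // bool(false)
var_dump(is_balanced('(()())'));     // bool(true)
```

The counter is a tiny piece of context carried along from the left, and that is exactly the kind of thing a formally regular expression cannot keep track of to arbitrary depth.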
 
 
Rad
Radministrator
Re: PHP and regular expressions
Reply #44 - Jun 19th, 2008 at 8:37pm
 
Quote:
And the source code for that little OS - called UNIX - was published, along with the C compiler itself

This guy invented *both* C and Unix?
 