This is not ProjectBloodhound material, at least not first semester stuff. But if you find yourself running into highly structured data — such as the reports from a spreadsheet or a database application — you have the ability to easily manipulate that data in PHP.
This is a simple example, but you don’t have to limit yourself to doing simple things. Imagine a data structure like this:
Name[tab]Phone Number
Cathleen Collins[tab]602-369-9275
Greg Swann[tab]602-740-7531
In the file the code shown here as “[tab]” would be an actual tab character, and this kind of data goes by the arcane name of: A tab-delimited file.
Most programming languages were written by exacting people with abstract and elegant reasons for everything they did. PHP was written by overbooked programmers who needed to pound out new web pages as quickly as possible.
In consequence, PHP is optimized for dealing with highly structured data. Here is a short program that will take a tab-delimited phone number file as input and output reformatted phone numbers into the HTML stream. In other words, this code could produce a dynamically-updated phone list in what might otherwise be a static web page:
<?php
// Recognize old Mac (\r) line endings. This setting is deprecated as of PHP 8.1.
ini_set ("auto_detect_line_endings", true);
$fi = fopen ("PhoneNums.txt", "r");
$line = fgets ($fi, 4096); // throw away fieldDef line
echo ("<b>Phone Numbers</b><br>");
while (!feof ($fi)) {
    $line = fgets ($fi, 4096);
    list ($Name, $Phone_Number) = explode ("\t", $line);
    if ($Name) {
        echo ("$Phone_Number <i>($Name)</i><br>");
    }
}
fclose ($fi);
?>
There is one line that makes all the difference for this kind of work:
list ($Name, $Phone_Number) = explode ("\t", $line);
The names between the parentheses are our known field names, and we’re using them as variable names for clarity’s sake. The explode function will create an array of separate fields from the text stored in the $line variable, splitting the fields on the tab character. The list construct then receives the array just created by explode and assigns each field, in order, to the matching variable. We only have two fields in this case, but I have a variation on these ideas that parses an MLS database that contains 213 fields per line of text.
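To watch that one line work in isolation, here is a minimal sketch. The helper name split_phone_line is made up for this illustration, and the rtrim and array_pad calls are additions of mine to keep a short or blank line from tripping up list; they are not part of the program above.

```php
<?php
// Hypothetical helper for illustration: split one tab-delimited line into its two fields.
function split_phone_line($line) {
    // rtrim drops the line ending that fgets leaves on the string
    $fields = explode("\t", rtrim($line, "\r\n"));
    // array_pad guarantees two elements, so list never reads a missing index
    list($Name, $Phone_Number) = array_pad($fields, 2, "");
    return array($Name, $Phone_Number);
}

list($Name, $Phone_Number) = split_phone_line("Greg Swann\t602-740-7531\n");
echo "$Phone_Number <i>($Name)</i><br>\n";
```

Run from the command line, this should echo the phone number first and the name in italics, the same shape the loop above produces for each line of the file.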
Once we have the fields assigned to the right variables, it’s duck soup to represent the data in whatever format we wish. Alternatively, we could write a new file out to disk. The routine that parses the MLS database writes XML files to disk using a few dozen of the available fields — and throwing the rest away.
In fact, from here it’s very easy to write XML files, such as those used by Realty.bots. Say so if you want to see a demonstration.
But there is a lot more that you can do with software like this. It’s common, when you get data that is almost what you want, to try to edit it in word processors or text editors. A parsing tool like this enables you to take complete control over the data, echoing it back as perfectly-formatted HTML or writing a formatted file out to disk.
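As a rough idea of what the XML route could look like, here is a sketch that turns the phone list into a small XML document. Everything specific in it is an assumption for illustration: the function names, the output file name PhoneNums.xml, and the element names phonelist, entry, name and phone. It is not the MLS routine and not the format any particular piece of software expects.

```php
<?php
// Hypothetical helper: render one parsed record as an XML element.
// The element names entry, name, and phone are assumptions for this sketch.
function phone_entry_xml($Name, $Phone_Number) {
    // ENT_XML1 escapes &, <, > and quotes so the output stays well-formed
    $n = htmlspecialchars($Name, ENT_XML1 | ENT_QUOTES, "UTF-8");
    $p = htmlspecialchars($Phone_Number, ENT_XML1 | ENT_QUOTES, "UTF-8");
    return "  <entry><name>$n</name><phone>$p</phone></entry>\n";
}

// Hypothetical wrapper: turn tab-delimited text (with a field definition line) into XML.
function phone_list_xml($text) {
    $lines = preg_split('/\r\n|\r|\n/', $text);
    array_shift($lines); // throw away fieldDef line
    $xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<phonelist>\n";
    foreach ($lines as $line) {
        $fields = explode("\t", $line);
        // skip blank or incomplete lines
        if (count($fields) >= 2 && $fields[0] !== "") {
            $xml .= phone_entry_xml($fields[0], $fields[1]);
        }
    }
    return $xml . "</phonelist>\n";
}

// Usage: file_put_contents("PhoneNums.xml", phone_list_xml(file_get_contents("PhoneNums.txt")));
```

Building the XML as a string keeps the sketch dependency-free. For anything bigger than a phone list, PHP's XML extensions would be the sturdier choice.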
Cheryl Johnson says:
Greg,
Sorry, some of us (me included) still need the version with training wheels. :)
So… I copy and paste the phone number data into a text editor and name the file PhoneNums.txt. OK, so far.
Then I copy and paste the PHP code into … what? Into an HTML document? A blank document? And name it RunPhones.php?
Then upload both files to a PHP-enabled server, and when I go to mysite.com/RunPhones.php, the “exploded” display will appear?
June 20, 2008 — 3:41 am
Teri Lussier says:
Cheryl-
*That’s* training wheels?!?! Oh dear. I need the trike version. I’m off to Da Blog Mother archives.
June 20, 2008 — 5:16 am
Cheryl Johnson says:
Another question: Suppose I pasted the PHP code into a WP sidebar.php file… Would I need to change the PHP fopen line to the absolute file path-the complete URL of where PhoneNums.txt is located?
(Obviously I haven’t yet got it working either way, or else I wouldn’t be asking :) )
June 20, 2008 — 6:14 am
Cheryl Johnson says:
BINGO! http://www.nelanews.info/
There, now ~I~ can write the training wheels version. Though I ‘pose, I could change PhoneNums.txt to my phone numbers,
June 20, 2008 — 6:20 am
Cheryl Johnson says:
Except it’s displaying the brackets and the word “tab”
June 20, 2008 — 6:21 am
Greg Swann says:
> Except it’s displaying the brackets and the word “tab”
You’re almost there. You have to change [tab] to a real tab character, which I can’t show in a weblog post.
Watch this:
That’s the field definition line if you download your activity report from PayPal. To name these as fields in PHP, we would need to lose the spaces, then convert the tabs into comma-space — like this:
From there, you have the ability to parse that report any way you want it.
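A hedged variation on the same idea: rather than typing the field names into list by hand, you can let PHP read them from the field definition line itself. This is a different technique from the list line in the post, and the helper names field_names and parse_record are invented for this sketch. The sample data is the phone list from the post, not a real PayPal report.

```php
<?php
// Hypothetical helper: turn a field definition line into usable key names
// by removing the spaces from each field name ("Phone Number" becomes "PhoneNumber").
function field_names($fieldDefLine) {
    $names = explode("\t", rtrim($fieldDefLine, "\r\n"));
    return array_map(function ($n) { return str_replace(" ", "", $n); }, $names);
}

// Hypothetical helper: pair one data line with those names.
function parse_record($names, $line) {
    $fields = explode("\t", rtrim($line, "\r\n"));
    // pad or trim so the two arrays are the same length, which array_combine requires
    $fields = array_slice(array_pad($fields, count($names), ""), 0, count($names));
    return array_combine($names, $fields);
}

$names = field_names("Name\tPhone Number\n");
$rec = parse_record($names, "Cathleen Collins\t602-369-9275\n");
echo $rec["PhoneNumber"], " (", $rec["Name"], ")\n";
```

With 213 fields per line, an array keyed by name like this saves you from maintaining a 213-variable list statement, at the cost of writing $rec["Name"] instead of $Name.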
This
will parse your iTunes library.
Not all data is well-structured — Microsoft products, as you might expect, introduce dysfunctional crap into everything — but a lot of the data you’re going to run into on the web will come to you just this way — tab- or comma-delimited with a field definition line. A parser like the one shown in this post is an easy way to manipulate that data to any other purpose: Formatted display on the web, edited or reorganized in another file or rendered as XML for another piece of software to devour.
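For the comma-delimited case, a plain explode on the comma breaks as soon as a field contains a comma inside quotes. PHP's built-in str_getcsv handles that. Here is a minimal sketch, with the helper name split_csv_line and the sample line both made up for illustration:

```php
<?php
// Hypothetical helper: split one comma-delimited line, honoring quoted fields.
function split_csv_line($line) {
    // separator, enclosure, and escape are passed explicitly rather than relying on defaults
    return str_getcsv(rtrim($line, "\r\n"), ",", '"', "\\");
}

list($Name, $Phone_Number) = split_csv_line("\"Swann, Greg\",602-740-7531\n");
echo "$Phone_Number <i>($Name)</i><br>\n";
```

The quoted name comes through as one field, comma and all, so the rest of the loop can stay exactly as it was for the tab-delimited file.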
June 20, 2008 — 7:24 am
Greg Swann says:
> Then I copy and paste the PHP code into … what? Into an HTML document? A blank document? And name it RunPhones.php?
Yes, or you could paste it in as a part of a standing PHP page. Inlookers: If you intend for PHP to “see” and process your code, the web page has to be named MyPageName.php. It’s okay if the page contains nothing but HTML, but the PHP parser will not act on any PHP within the page if it is named MyPageName.htm or MyPageName.html. It’s good practice to name all your new pages MyNewPage.php. That way PHP will be available to you now — even if you don’t need it — or later — when you might.
Cheryl, here’s another way of thinking of this: You’re building “PhoneNums.txt” so that you or anyone — or a piece of software — can update the phone numbers without messing with the code. What if you were to isolate your call to the code, so that it can be changed without your having to change each file that references it? Instead of pasting in the code, you could save it to a separate file, then include it wherever you want it:
When you edit “RunPhones.php”, the changes will be reflected instantly everywhere you have “included” it. And, yes, in this circumstance, I would use an absolute path, as I’m showing here.
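A minimal sketch of that kind of include, assuming a made-up absolute path (/home/mysite/public_html/ is a placeholder, not a real location) and a hypothetical wrapper function, include_if_present, that keeps the page from breaking if the file goes missing:

```php
<?php
// Hypothetical helper: include a PHP file only if it is there, and say whether it ran.
function include_if_present($path) {
    if (!is_readable($path)) {
        return false;
    }
    include $path; // runs the file and emits whatever it echoes
    return true;
}

// The path below is a made-up example of an absolute path; use your server's own.
if (!include_if_present("/home/mysite/public_html/RunPhones.php")) {
    echo "<!-- phone list unavailable -->\n";
}
```

One design note: a file included from inside a function runs in that function's variable scope. That is fine for a self-contained routine like the phone list, but a bare include statement in the page is the simpler choice if the included code needs the page's own variables.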
June 20, 2008 — 7:43 am
Cheryl Johnson says:
Oh. Duh on the tab thing.
Here’s the part I don’t know nuthin’ about yet: If I want that PHP code to run somewhere other than just a WordPress sidebar … if I want it to do things locally just on my own computer …. I’m going to need to install a web server app on my machine, right?
June 20, 2008 — 7:43 am
Greg Swann says:
> If I want that PHP code to run somewhere other than just a WordPress sidebar …
You can run PHP on any Apache web server that has PHP installed, as long as the file is named FileName.php. A PHP file can contain any valid HTML, plus any valid PHP. The real purpose of PHP is to produce valid HTML at runtime. In other words, if you View Source in a PHP page (lots of them on my sites), you will never see anything except valid HTML, even though some, most or all of it will have been rendered at runtime by software. Revisit my discussion of a contributor’s blogroll as an example.
> if I want it to do things locally just on my own computer …. I’m going to need to install a web server app on my machine, right?
That’s right. I don’t know how to do this in the Windows world. It’s baked in the cake on any OS X Macintosh — you have to download PHP, but every Mac is an Apache web server out of the box. Even so, I almost never use localhost, not even for testing. I either edit locally and FTP to a server (we have one we use for testing) or just edit directly on the server. For this, you need an FTP client that integrates with a text editor that will in its turn open and save files directly from a file server. In a year or two, the integration between onsite and offsite storage will be complete and you will FTP in and out of your file servers just as if they were hard disks mounted on your desktop.
It seems a little weird to go to a file server for everything, but I have three reasons for working this way. 1. I’ve lived most of my adult life using desktop compilers, and, while they are a lot more robust, they’re a big pain in the ass to work with, where quick-and-dirty PHP might be pretty damned dirty, but it’s pretty damned quick. 2. Almost everything I do now is bound for the web anyway, so there’s no reason to solve problems locally, just to solve them again on the web server. 3. Anything I do in PHP I can share with anyone else on Apache servers, without worrying about the pestilential virus known as Microsoft Windows.
Cameron keeps telling me that I need to learn Ruby on Rails, and probably I do. PHP was written by recovering C programmers, so it slides right into my mind with no effort, which means I can punch things out without having to puzzle them out. For web-based programming, it seems like an optimal solution for me, right now: I can do anything I want — including something as complex as engenu, which is written entirely in PHP — and I can pound out bread-and-butter stuff with alacrity. I think PHP is worth knowing, and I don’t think there has ever been a better time for ordinary (non-geek) people to learn to write software.
June 20, 2008 — 8:09 am
CJ, Broker in NELA, CA says:
Aside: Years ago a C programmer told me that C was a garden of delight, then C++ came along and crapped on the flowers….
June 20, 2008 — 10:18 am
CJ, Broker in NELA, CA says:
Re MyPage.php …. OK … Got that working http://www.nelanews.info/testrun.php
I haven’t tried yet, but I’m supposing I could specify a stylesheet in the header of that page, and the page would then echo that design?
June 20, 2008 — 11:19 am
CJ, Broker in NELA, CA says:
OK. Got it running as a single page. http://www.nelanews.info/testrun.php
(No, I haven’t fixed the tab thing yet.) I haven’t tried yet, but I’m supposing if I specify a stylesheet in the header, the page will then echo that style, just like a regular html page?
June 20, 2008 — 11:31 am
Greg Swann says:
Check. By the time the http handler sees it, it’s all HTML. The contributor’s blogroll in the sidebar is inheriting the sidebar’s CSS. If I were to call that same routine from a post, it would look like a list in a post instead.
June 20, 2008 — 11:45 am
Cheryl Johnson says:
Oh. My. Goodness. I’m slowly getting there:
http://www.nelanews.info/SuperBowlHistory.php
(no stylesheet yet – just working on concept)
What if I wanted the data to populate a table?
June 20, 2008 — 7:02 pm
Greg Swann says:
You’ve got it. The CSS part is easy. HTML is HTML.
> What if I wanted the data to populate a table?
Easily done. Remember that the HTML surrounding your variables is just HTML. You would format a table the same way you would do it manually, but you only have to do the job once.
For something like this, I would use a function:
This is more from the K&R world. When you call explode or list, you’re calling a function; it’s just built into PHP. We can create our own functions for isolating and simplifying repetitive jobs.
Now instead of doing your echoes in your main loop, you would call
instead. Pushing an ugly job like this off to the function makes the code easier to maintain and more self-documenting.
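Here is a sketch of what a function like that might look like. The name table_row and the two-column layout are assumptions for illustration, and the htmlspecialchars escaping is an addition of mine, so treat this as one plausible shape rather than the function described above.

```php
<?php
// Hypothetical function name and layout, assumed for this sketch.
function table_row($Name, $Phone_Number) {
    // escape the data so stray < or & characters cannot break the table markup
    $n = htmlspecialchars($Name, ENT_QUOTES, "UTF-8");
    $p = htmlspecialchars($Phone_Number, ENT_QUOTES, "UTF-8");
    return "<tr><td>$n</td><td>$p</td></tr>\n";
}

echo "<table>\n";
echo "<tr><th>Name</th><th>Phone Number</th></tr>\n";
// In the real loop these values would come from list and explode on each line.
echo table_row("Cathleen Collins", "602-369-9275");
echo table_row("Greg Swann", "602-740-7531");
echo "</table>\n";
```

Because the row markup lives in one place, changing the table later (adding a class, swapping the column order) means editing the function once instead of hunting through the loop.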
June 20, 2008 — 7:25 pm