Friday, 30 March 2007

Wales on the Web - Gone but not forgotten

Last night we turned off the Apache process for Wales on the Web. This was the project I was initially hired to work on here at The National Library of Wales, but the project ran its course and it was time for us to ingest the data into our Virtua system. I wrote MARC21 export scripts for the Wales on the Web Postgres relational DB schema and we loaded the data into Virtua. By this point it was pointless to keep running the old website alongside the data now available in our iPortal. So, last night I shut it off.

However, none of the added-value content from the old website has been lost.

A few simple lines of PHP exported all of the static content pages held within the DB as iPortal template documents:



<?php
// Assumes a PostgreSQL connection has already been opened with pg_connect().
$sql = "SELECT * FROM lu_pages";

$records = pg_query($sql);

// Create the output directory unless it is already there.
if (!is_dir("/tmp/guides")) {
    mkdir("/tmp/guides");
}

/* OK time for the logic. Write one file per page and language (en, cy). */
while ($records_list = pg_fetch_assoc($records)) {
    foreach (array('en', 'cy') as $lang) {
        $filecontents = "\n";
        $filecontents .= "\n\n".$records_list['page_title_'.$lang]."\n\n\n";
        $filecontents .= $records_list['page_content_'.$lang];
        $filecontents .= "\n";
        $filename = "/tmp/guides/".$records_list['page_id']."-".$lang.".html";

        echo "writing: {$filename}\n";
        file_put_contents($filename, $filecontents);
    }
}

?>



And some mod_rewrite rules on the iPortal server allowed us to resolve all of the old Wales on the Web URLs that people have linked to over the years.


# mod_rewrite rules used to keep Wales on the Web URLs valid
RewriteEngine on

# Rewrite English language domain to English e-Resources skin
RewriteCond %{HTTP_HOST} www.walesontheweb.org$
RewriteRule ^$ /cgi-bin/gw/chameleon?lng=en&skin=eresources [L,R=301]

# Rewrite Welsh language domain to Welsh e-Resources skin
RewriteCond %{HTTP_HOST} www.cymruarywe.org$
RewriteRule ^$ /cgi-bin/gw/chameleon?lng=cy&skin=eresources [L,R=301]

# Add trailing slashes to URLs
RewriteCond %{REQUEST_URI} ^/[^\.]+[^/]$
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [R=301,L]

# Rewrite the CAYW Guides to use the vtls_link DIPO
RewriteRule ^cayw/guides/([^/\.]+)/([^/\.]+)/?$ /cgi-bin/gw/link/vtls_link.pl?file=http://cat.llgc.org.uk/gw/html/eresources/$1/static_content/guides/$2-$1.html&$3&skin=eresources&lng=$1 [L]

# Rewrite CAYW Dewey Decimal URLs to a Dewey Search on the e-Resources skin
RewriteRule ^cayw/index/([^/\.]+)/([^/\.]+)/([^/\.]+)/?$ /cgi-bin/gw/chameleon?lng=$1&search=FREEFORM&function=INITREQ&elementcount=1&t1=dc:$2.$3&skin=eresources

# Rewrite CAYW Dewey Decimal URLs to a Dewey Search on the e-Resources skin
RewriteRule ^cayw/index/([^/\.]+)/([^/\.]+)/?$ /cgi-bin/gw/chameleon?lng=$1&search=FREEFORM&function=INITREQ&elementcount=1&t1=dc:$2&skin=eresources


So, now you can look at the Curriculum Cymraeg at its original address:

http://www.walesontheweb.org/cayw/guides/en/1/

or look at a list of Dewey Decimal catalogued websites for National Assembly Government Members:

http://www.walesontheweb.org/cayw/index/en/324/2093/
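To see where those two addresses end up, here is a rough sketch in PHP that mirrors the three cayw rules with preg_match. The function name cayw_target is made up for this illustration and is not part of the iPortal setup; Apache does the real work. It takes the path without the leading slash, the way the rules see it in a per-directory context, and it leaves out the empty $3 back-reference from the guides rule.

```php
<?php
// Hypothetical helper, for illustration only: returns the target that the
// cayw RewriteRules would produce for a given path, or null if none match.
function cayw_target($path)
{
    $chameleon = '/cgi-bin/gw/chameleon';
    $search = '&search=FREEFORM&function=INITREQ&elementcount=1&t1=dc:';

    // Guides: cayw/guides/<lang>/<page id>/
    if (preg_match('#^cayw/guides/([^/.]+)/([^/.]+)/?$#', $path, $m)) {
        return '/cgi-bin/gw/link/vtls_link.pl'
            . '?file=http://cat.llgc.org.uk/gw/html/eresources/' . $m[1]
            . '/static_content/guides/' . $m[2] . '-' . $m[1] . '.html'
            . '&skin=eresources&lng=' . $m[1];
    }

    // Dewey number in two parts: cayw/index/<lang>/<part one>/<part two>/
    if (preg_match('#^cayw/index/([^/.]+)/([^/.]+)/([^/.]+)/?$#', $path, $m)) {
        return $chameleon . '?lng=' . $m[1] . $search
            . $m[2] . '.' . $m[3] . '&skin=eresources';
    }

    // Dewey number in one part: cayw/index/<lang>/<number>/
    if (preg_match('#^cayw/index/([^/.]+)/([^/.]+)/?$#', $path, $m)) {
        return $chameleon . '?lng=' . $m[1] . $search
            . $m[2] . '&skin=eresources';
    }

    return null;
}

echo cayw_target('cayw/guides/en/1/'), "\n";
echo cayw_target('cayw/index/en/324/2093/'), "\n";
```

The first call shows the guide being served from the exported 1-en.html file, and the second shows the two path segments being joined into a dc:324.2093 search term.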

I hope these little titbits are useful to someone else out there at some point. They have served me well in these past weeks.
