
TYPO3 to Drupal Migration for Greenlandic Broadcasting Corporation

Migrating over 60,000 stories and media from TYPO3 to Drupal with the migrate module.

Greenlandic Broadcasting Corporation (KNR) is the official TV and radio broadcasting agency of Greenland. Their website, www.knr.gl, was originally built in TYPO3 (the old site is now available at: http://typo3.knr.codespry.com/).
  • Migrated over 60,000 news stories and over 12,000 photos to a Drupal database

  • Multilingual website - available in Greenlandic, Danish and English

  • Videos being served from Kaltura

  • Advanced Search features using the Apache Solr search engine

How We Got the Project

We had won an international bid to redesign and rebuild the website in TYPO3. We had worked extensively on TYPO3 from 2004 to 2009, which is what won us the project. However, during the post-award analysis and design phase, we recognised that the project would be better off done in Drupal instead of TYPO3. We had built www.indiaenvironmentportal.org.in, an environment news portal, in Drupal, and the experience gained from that development led us to believe that news and online media publishing sites should be built in Drupal. We had the benefit of having seen both systems from close quarters.

Key Reasons for choosing Drupal over TYPO3

We'll start with a disclaimer: we really liked working in TYPO3. It works very well for some applications, such as intranets. We chose Drupal over TYPO3 for this website for the following reasons.

  1. The concept of "Taxonomy" is central to Drupal. A taxonomy could be a "tag library" used for tagging all stories written on a website. TYPO3 had no such concept, and still has none to date.

  2. Availability of an Apache Solr integration module. An equivalent module arrived much later in TYPO3 and is only now reaching maturity, and that too under a near-commercial model from the agency that wrote it. Taxonomy concepts hook easily onto Apache Solr for filtering on meta-information such as tag library, authors, publications, etc. This can be seen in action at www.indiaenvironmentportal.org.in

  3. The TYPO3 backend forms, for a news publishing website with many thousands of articles, added a level of usability complexity.

  4. Meta information such as "Authors" is stored in tt_news (a core TYPO3 extension - actually the heart of TYPO3) as simple label entries; in Drupal, however, an Author can be part of a Taxonomy and complete User Profiles can be made for these authors - by default

  5. Article/News creation by way of simple forms. Easy bulk publish/unpublish of news

  6. Complex news-news and news-article relations are easily managed using built-in relationships in Drupal

We must acknowledge another key reason. After the www.indiaenvironmentportal.org.in experience, a lot of our developers started to love working in Drupal more than in TYPO3 because of the "control" they experienced - such as the ability to write themes using PHPTemplate instead of depending on TypoScript, the native TYPO3 scripting tool for theming TYPO3 websites. TypoScript has advantages for non-technical people, but developers and themers often dislike it.

Management Challenges

We faced many challenges during the design and ideation of the project. These related to management of the project (at the client's end and ours) and to attrition at Srijan. Both we and the client were concerned about the future of the project. However, both the client and Srijan stayed committed to the relationship. Soon, the client appointed a full-time Project Manager for the project, and Srijan re-committed itself to the project.

Enter OpenPublish

We soon realised that our product had become unstable over several months of stop-start work. We had wanted to work with OpenPublish for a long time, and saw the KNR site as an ideal case for a move to OpenPublish. Srijan's commitment to its clients was reflected here. We invested in a research team to work on OpenPublish, using KNR as the case: a three-person team was given three weeks. All this was done at Srijan's own investment and initiative, with minimal investment from the client (only in the form of a regular maintenance signup).

What is OpenPublish

As the OpenPublish website describes, it is:

"OpenPublish has been designed to meet the needs of any publisher – whether large newspaper, TV news site, niche information publication or something in between. It is a flexible solution easily tailored to fit any organization’s needs."

It is based on Pressflow, a performance-tuned implementation of Drupal, has Memcached and Varnish implemented by default, and has an Apache Solr integration built in. See the complete feature list on the OpenPublish website.

Research complete; time to roll

It took the same team another 5-6 weeks from research completion to get the website live at www.knr.gl, including migration from the TYPO3 website to OpenPublish.

Migration from TYPO3 to Drupal

Our starting point was the Drupal migrate module and the case study written on the migration of The Economist magazine to Drupal.

Analysis of the data to be migrated

We studied the data in TYPO3 that needed to be migrated. Here are the counts of the content we identified for migration, and eventually migrated.

Migration Tables

Do note that Gallery images were migrated in a different manner, which is why the above screenshot shows 0 in the "migrated" column. For migrating the gallery images we used simple PHP scripts, which also took care of "incremental migrations".

Challenges with TYPO3 DB migration

There were several challenges in mapping the TYPO3 database structure onto Drupal. These were magnified by a poorly implemented TYPO3 setup on the KNR website.

Poor TYPO3 implementation

The TYPO3 implementation done for KNR was a big mess. Here are some examples:

  1. The site allowed photographers to register and upload their photographs through a TYPO3 extension called smooth_gallery. However, instead of one instance of the gallery managing all photographers and their photos in albums, a separate TYPO3 page was created for each photographer under their name, and an instance of smooth_gallery was created and embedded into that page. smooth_gallery in turn created a folder with the photographer's name, in which all images were finally stored.
  2. There were 1800 pages, one for each of the 1800 registered photographers of the KNR website

Differences between the TYPO3 DB structure and Drupal DB structure

  1. An "Author" (internal users at KNR) of a tt_news news story is a simple label entry. Therefore, while the same author may have entered several news items, the author's name is stored again with each item, simply as a field value, and the email stored alongside it could differ from one item to the next. While migrating this to Drupal, we had to ensure data integrity in the author profiles created for internal users as well as external users - photographers who registered on the site to upload their photos.

  2. These photos resided independently in folders, and had to be made available to the news editors for use in news stories on the website. Therefore a Digital Assets repository had to be implemented.

  3. In Drupal, however, an Author can be part of a Taxonomy and complete User Profiles can be made for these authors. Also, the photographs and photo galleries they made had to be associated with their profiles.
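To illustrate the author problem described in this list, here is a minimal sketch in plain PHP of how repeated author labels could be collapsed into one profile per author before user profiles are created. It is not the code used on the project: the function name build_author_profiles is hypothetical, and the 'author' and 'author_email' keys are an assumption about how the tt_news rows are passed in. The mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: collapse free-text author labels into one profile each.
// Assumes rows carry 'author' and 'author_email' keys holding valid UTF-8.
function build_author_profiles(array $rows) {
    $profiles = array();
    foreach ($rows as $row) {
        // Normalise whitespace so "Jens  Olsen " and "Jens Olsen" compare equal.
        $name = trim(preg_replace('/\s+/u', ' ', $row['author']));
        if ($name === '') {
            continue; // No label to build a profile from.
        }
        // Case-insensitive key; mb_strtolower also lowercases characters like Æ, Ø, Å.
        $key = mb_strtolower($name, 'UTF-8');
        if (!isset($profiles[$key])) {
            $profiles[$key] = array('name' => $name, 'emails' => array());
        }
        // Keep every distinct email seen for this author, for later review.
        $email = strtolower(trim($row['author_email']));
        if ($email !== '' && !in_array($email, $profiles[$key]['emails'], true)) {
            $profiles[$key]['emails'][] = $email;
        }
    }
    return $profiles;
}
```

Keeping all emails seen for a name, rather than picking one, leaves the decision about which address belongs on the profile to a later step or to an editor.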

Intermediate Database design

To handle the above situations, an intermediate database schema had to be prepared. This would allow a clean migration of content between TYPO3 and Drupal, each according to its own structure.

TT_News Relation

Photo Gallery relation

Incremental Migration

Since the KNR website (TYPO3) was in production, content would have to be migrated continually after UAT; the new Drupal website would have to start serving live, real-time content. For this, an incremental migration process had to be followed for news stories (including images), for the Photo Gallery and for any new photographer registrations.

Here's a sample of the code we used for such incremental migrations (this excerpt handles gallery images):

    // Strip the folder path so that only the file name remains.
    $data['name'] = str_replace($data['path'], '', $data['name']);

    // Create one Drupal node per gallery image.
    $node->type = $type;
    $node->title = $data['caption'];
    $node->uid = $uid;
    $node->status = $status;
    $node->created = $data['crdate'];
    node_save($node);
    $nid = $node->nid;
    // Clear the nid so that a reused $node is saved as a new node next time.
    $node->nid = '';

    // Register the file; placeholders let Drupal escape quotes in captions.
    db_query("INSERT INTO {files} (uid, filemime, filename, filepath, status, timestamp)
              VALUES (%d, '%s', '%s', '%s', %d, %d)",
      $uid, $fileMime, $data['caption'], $filePath . $sourcePath . $data['name'], $status, $data['crdate']);
    $fid = db_last_insert_id('files', 'fid');

    // Attach the file and its node to the gallery.
    db_query("INSERT INTO {node_galleries} (gid, nid, fid) VALUES (%d, %d, %d)", $gid, $nid, $fid);

    // Remember the last migrated TYPO3 uid, so the next run starts after it.
    db_query("UPDATE " . $typo3Db . "migrate_image_status SET data = %d WHERE name LIKE 'image_migrate'", $data['uid']);
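
The excerpt above records the uid of the last migrated row in migrate_image_status. A minimal standalone sketch of that marker idea, using in-memory arrays in place of the TYPO3 tables, might look like the following. The function name migrate_new_rows and the callback argument are hypothetical and only serve the illustration; this is not the project's script.

```php
<?php
// Hypothetical sketch: migrate only rows newer than the stored marker.
// $migrateRow is any callable that migrates a single source row.
function migrate_new_rows(array $sourceRows, $lastUid, $migrateRow) {
    // Ascending uid order, so the marker only ever moves forward.
    usort($sourceRows, function ($a, $b) {
        return $a['uid'] - $b['uid'];
    });
    foreach ($sourceRows as $row) {
        if ($row['uid'] <= $lastUid) {
            continue; // Already migrated in an earlier run.
        }
        call_user_func($migrateRow, $row);
        // Advance the marker only after the row has been migrated.
        $lastUid = $row['uid'];
    }
    // The caller stores this value and passes it in on the next run.
    return $lastUid;
}
```

Running the function a second time with the returned marker migrates nothing, which is what makes it safe to schedule repeatedly while the source site is still live.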
 

Converting Latin1 charset tables with UTF8 data set

The TYPO3 site was multilingual - English, Danish and Greenlandic. The TYPO3 DB had Latin1 charset tables with UTF8 data stored in them, which needed to be converted to UTF8 for a Drupal database.

Our initial approach was to change the DB and table charset to UTF8, which would convert Latin1 data to UTF8 with commands like:

  1. ALTER TABLE {tablename} MODIFY {table column} CHAR(20) CHARACTER SET utf8
  2. ALTER TABLE {tablename} DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci

But this did not work, presumably because the stored bytes were already UTF8: a conversion that treats them as Latin1 encodes them a second time. After searching through Google we came upon this post - http://bit.ly/1RAqTO - which gave us the breakthrough. The solution was to convert the fields to BLOB, and then the BLOB fields to UTF8. A more detailed account is available at: http://www.srijan.in/blog/converting-latin1-charset-tables-utf8-data-set
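
The difference between the two approaches can be sketched in plain PHP. The first function imitates what a direct Latin1-to-UTF8 column conversion does to bytes that are already UTF8 (ISO-8859-1 is used here as an approximation of MySQL's latin1). The second builds the two statements of the BLOB detour, which leaves the bytes untouched and only changes how the column is labelled. Both function names are hypothetical, the mbstring extension is assumed, and the table, column and type in the usage lines are examples only.

```php
<?php
// Imitates a direct latin1 -> utf8 conversion: every stored byte is read as
// one Latin1 character and re-encoded, so existing UTF-8 gets double-encoded.
function convert_as_latin1($bytes) {
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

// Builds the two-step conversion: to BLOB first (drops the charset label,
// keeps the bytes), then back to a text type declared as utf8.
function blob_conversion_sql($table, $column, $type) {
    return array(
        "ALTER TABLE `$table` MODIFY `$column` BLOB",
        "ALTER TABLE `$table` MODIFY `$column` $type CHARACTER SET utf8",
    );
}

// Usage with example values (not taken from the KNR database):
echo convert_as_latin1("Gr\xC3\xB8nland"), "\n"; // "Grønland" comes out garbled
echo implode(";\n", blob_conversion_sql('tt_news', 'title', 'VARCHAR(255)')), ";\n";
```

For long text columns the intermediate type would need to be a binary type large enough to hold the data, such as MEDIUMBLOB or LONGBLOB.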

The TYPO3 migrate module

We wrote a TYPO3 migrate module in the course of this KNR website migration and of a migration for another client, East West Center (coming up soon).

Business Benefits

  • Srijan worked like a true partner to KNR, enabling them to move to the platform that would serve their needs better, even incurring costs to ensure that KNR received the Drupal implementation that would work better.

  • KNR saw a smooth migration of a huge number of photographs and news stories, even with the TYPO3 website being live till the OpenPublish version was ready.

  • Srijan enabled an end-to-end solution for KNR by bringing in design partners to handle concepts and wireframes.

Learnings for the future

Updated Content Migrations

We had not used a feature of the migrate module that allows re-migrating content records - such as News and Article stories - which have already been migrated and were updated afterwards. Instead, we compiled all such updated records based on a date indicator, and migrated them separately by first rolling back these stories and then re-migrating them with the migrate module itself.

In our next migration, we would like to use the update feature of the migrate module.
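
As an illustration of the date-based approach we took, the following plain PHP sketch picks out source records that were changed after they had been migrated. It is a simplified stand-in, not our actual script: the function name find_updated_rows is hypothetical, and it assumes each source row carries a 'tstamp' modification time and that the time each uid was migrated has been recorded.

```php
<?php
// Hypothetical sketch: list uids that were migrated earlier and changed since.
// $migratedAt maps source uid => Unix time at which that row was migrated.
function find_updated_rows(array $sourceRows, array $migratedAt) {
    $stale = array();
    foreach ($sourceRows as $row) {
        $uid = $row['uid'];
        // Rows never migrated are left to the normal incremental run.
        if (isset($migratedAt[$uid]) && $row['tstamp'] > $migratedAt[$uid]) {
            $stale[] = $uid;
        }
    }
    return $stale;
}
```

The uids returned are the ones that would be rolled back and migrated again.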
