About Me

PhD Candidate at Purdue University, Computer Science.

Sunday, February 03, 2008

Boutros Boutros-Ghali Archive

I have been assigned to the Boutros-Ghali archiving project since I moved to Bibliotheca Alexandrina.

My task was to supervise and organize the digitization of the 31,650 documents, build an automated data-entry engine, and finally build a site to publish the documents.

The collection includes various documents related to the positions occupied by Boutros Ghali as:

1. The Egyptian Minister of State for Foreign Affairs from 1977 until early 1991. This group includes important documents pertaining to Egyptian foreign affairs, such as the Arab-Israeli conflict, the Camp David Accords, and Egypt’s role in the African and Arab regions.

2. The Secretary-General of the United Nations from January 1992 to December 1996. The BA has obtained His Excellency’s meeting notes for the years 1993, 1994, and 1996.

Considerable effort went into categorizing the documents by type, by language, and finally by content. So far the following type categories have been established: reports, letters, press, treaties, speeches, conversations, and meeting notes. The collection contains documents in numerous languages, including Arabic, English, French, Spanish, German, Italian, and Indian. The contents of the documents have also been categorized under UN, Arab-Israeli Conflict, Africa, Middle East, South America, North America, Asia, and finally Europe, since Egypt’s foreign relations extended all over the world.

To achieve this classification, an indexing application was developed. It works as the backend that helps operators reorder the scanned documents, assemble related pages into one job entry, assign each job to its category, and finally insert the metadata field values (date, title, etc.).
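To make the idea of a job entry more concrete, here is a minimal sketch of how such a record could be modeled. It is not the actual code of the indexing application: the names `Page`, `Job`, `assemble_job`, and `DOC_TYPES`, as well as the file names and field values in the usage note, are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Document types listed in the post; the exact spellings used as keys
# here are an assumption.
DOC_TYPES = {
    "report", "letter", "press", "treaty",
    "speech", "conversation", "meeting notes",
}


@dataclass
class Page:
    scan_file: str  # hypothetical name of the scanned image file
    order: int      # position of the page inside its document


@dataclass
class Job:
    doc_type: str   # e.g. "letter"
    language: str   # e.g. "French"
    subject: str    # e.g. "Africa"
    pages: List[Page] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject types outside the established categories.
        if self.doc_type not in DOC_TYPES:
            raise ValueError(f"unknown document type: {self.doc_type!r}")


def assemble_job(pages, doc_type, language, subject, **metadata) -> Job:
    """Group related pages into one job, restoring their reading order."""
    ordered = sorted(pages, key=lambda p: p.order)
    return Job(doc_type, language, subject, ordered, dict(metadata))
```

For example, `assemble_job([Page("scan_0002.tif", 2), Page("scan_0001.tif", 1)], "letter", "French", "Africa", title="Sample")` would return a job whose pages are back in reading order and whose metadata holds the title.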

Furthermore, the indexing application reduces the time required for data entry by running OCR directly on the portions of each page that contain the important fields and saving the recognized values to the database.
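This kind of field-level (zonal) OCR can be sketched roughly as follows. The zone coordinates, the `FIELD_ZONES` layout, and the `ocr_zone` callback are assumptions made for illustration; the post does not say which OCR engine or page layout the real application uses, so the engine is passed in as a plain function.

```python
from typing import Callable, Dict, Tuple

Box = Tuple[int, int, int, int]              # left, top, right, bottom (pixels)
FracBox = Tuple[float, float, float, float]  # same box as fractions of the page

# Hypothetical layout: where the date and title sit on a page.
FIELD_ZONES: Dict[str, FracBox] = {
    "date": (0.60, 0.05, 0.95, 0.12),
    "title": (0.10, 0.15, 0.90, 0.25),
}


def to_pixels(zone: FracBox, width: int, height: int) -> Box:
    """Convert a fractional zone into pixel coordinates for one page size."""
    left, top, right, bottom = zone
    return (
        round(left * width), round(top * height),
        round(right * width), round(bottom * height),
    )


def extract_fields(
    page_size: Tuple[int, int],
    ocr_zone: Callable[[Box], str],
    zones: Dict[str, FracBox] = FIELD_ZONES,
) -> Dict[str, str]:
    """OCR only the zones that hold metadata fields, not the whole page."""
    width, height = page_size
    values = {}
    for name, zone in zones.items():
        raw = ocr_zone(to_pixels(zone, width, height))
        values[name] = " ".join(raw.split())  # collapse stray whitespace
    return values
```

Here `ocr_zone` stands in for whatever engine performs the recognition on a cropped region. The resulting dictionary is the sort of value set that could then be written to the database alongside the job entry.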

A web interface is being designed, and the site that will publish the collection is currently under development.
