Date of Award

May 2014

Degree Type

Thesis

Degree Name

Master of Science

Department

Computer Science

First Advisor

Ethan V. Munson

Committee Members

Tian Zhao, Cheng Thao

Keywords

LibreOffice, Molhado, Revision History, Unique Identifier, Version-Aware, Version-Aware-Document

Abstract

Version control systems provide a methodology for tracking changes in a document over its lifetime and provide better management and control of evolving document collections, such as source code for large software systems. However, no version control system currently supports such functionality for office documents.

An office document can go through many modifications during its lifetime and can be developed by multiple technical or non-technical users. It might be desirable to know how the document reached its final state, and sometimes to retrieve older versions of the document or merge two different versions without manual effort.

This thesis explains how versioning support can be implemented for LibreOffice documents without additional infrastructure for version repositories. Embedding versioning data within the office document can make version control a seamless part of the writing process. A document modified to carry embedded versioning data is called a version-aware document.

A versioning framework developed previously at UWM provides this versioning functionality for version-aware XML documents by calculating the reverse deltas between revisions. A version-aware XML document integrates full versioning functionality into an XML document type, using XML namespaces to avoid document type errors. Version-aware XML documents contain a preamble with versions stored in reverse delta format, plus unique ID attributes attached to the nodes of the document. They support the full branching and merging functionality familiar to software engineers, in contrast to the constrained versioning models typical of office applications.
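The two ingredients named above, namespaced unique IDs and reverse deltas, can be illustrated with a minimal Python sketch. It is not the UWM framework: the namespace URI `urn:example:version-aware`, the `n1`, `n2`, ... ID scheme, and the functions `assign_ids` and `reverse_delta` are all hypothetical, and the delta only tracks node text and node IDs rather than whole subtrees.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical namespace for versioning metadata; not taken from the thesis.
VA_NS = "urn:example:version-aware"
VA_ID = f"{{{VA_NS}}}id"


def assign_ids(root, counter=None):
    """Attach a unique, namespaced ID to every element that lacks one."""
    if counter is None:
        counter = itertools.count(1)
    for elem in root.iter():
        if VA_ID not in elem.attrib:
            elem.set(VA_ID, f"n{next(counter)}")
    return root


def text_by_id(root):
    """Map each node ID to its stripped text content."""
    return {e.get(VA_ID): (e.text or "").strip() for e in root.iter()}


def reverse_delta(old_root, new_root):
    """Describe what must change in the NEW revision to recover the OLD one."""
    old, new = text_by_id(old_root), text_by_id(new_root)
    return {
        # Text to put back on nodes that exist in both revisions.
        "restore_text": {i: t for i, t in old.items() if i in new and new[i] != t},
        # Nodes deleted in the new revision (a real delta would store their content).
        "reinsert": sorted(i for i in old if i not in new),
        # Nodes added in the new revision, to be dropped when going back.
        "remove": sorted(i for i in new if i not in old),
    }
```

Because the IDs live in their own namespace, they travel with the nodes through edits, so a revision pair can be compared node by node instead of by position. Storing the delta in the reverse direction keeps the latest revision directly readable and reconstructs older ones on demand.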

LibreOffice is a free, open-source office suite that is widely used for document creation; it branched off from OpenOffice in 2010. It is managed by The Document Foundation and includes applications for text documents, spreadsheets, presentations, drawings, and databases. Each document is stored in the OpenDocument Format (ODF), which packages a collection of XML files.
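To make the "collection of XML files" concrete: an ODF document is a ZIP package, so its XML parts can be inspected with standard tools. The following is a minimal Python sketch; the helper names `xml_parts` and `root_tag` are hypothetical, and the default member name `content.xml` is the part that conventionally holds the document body.

```python
import zipfile
import xml.etree.ElementTree as ET


def xml_parts(odf_file):
    """Return the names of the XML members inside an ODF package."""
    with zipfile.ZipFile(odf_file) as pkg:
        return sorted(n for n in pkg.namelist() if n.endswith(".xml"))


def root_tag(odf_file, member="content.xml"):
    """Parse one XML member and return the tag of its root element."""
    with zipfile.ZipFile(odf_file) as pkg:
        return ET.fromstring(pkg.read(member)).tag
```

Either function accepts a path to a document or an open binary file object. Versioning data embedded in such XML parts travels with the document file itself, which is what removes the need for a separate repository.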

The current project aims to show the practicality of the version-aware XML document approach by modifying the LibreOffice suite to support version awareness. Doing so requires understanding the architecture of the LibreOffice application as well as its document load and save cycles, its XML element and attribute processing, its class hierarchies, and its internal data structures. We have modified the source code of the LibreOffice Writer application to accept and preserve the required changes.
