OntoEditor Documentation

Overview • Architecture • Motivation • Implementation • Installation • Source Code • License • Contact

Overview:

Online Collaborative Ontology Editor (OntoEditor), built on top of distributed version control systems, is an approach to supporting the collaborative development of ontologies, with syntax parsing for different RDF serialization formats such as Turtle, JSON-LD, and RDF/XML.

Motivation:

John, Robert, and Lisa, experienced ontology engineers, face challenges in collaborative ontology development. Current tools lack real-time collaboration, syntax checking, and efficient communication. To streamline this, we've devised a solution enabling real-time collaborative editing, live syntax checking, and instant communication.

Users work on a shared document, seeing each other's changes, cursor positions, and syntax errors in real-time. Discussions happen through live chat, resolving errors collaboratively. Only authorized users can commit changes to the remote repository. This approach simplifies collaboration, ensuring real-time error detection and synchronization with the remote repository.

OntoEditor Architecture and Workflow:

The architecture of OntoEditor consists of several components designed to serve as a Streamlined Visual KG Builder specifically tailored for novice users. The main components are shown in the diagram below.

Main Components:

Editor: CodeMirror, a JavaScript-based editor, was selected for its robust programmable API and advanced editing capabilities such as auto-indentation, auto-completion, syntax highlighting, and search. As an open-source editor widely used in various projects, CodeMirror supports syntax highlighting for over a hundred languages, including Turtle, XML, and JSON-LD. Crucially, it enables collaboration by detecting changes through onChange events.
Real-Time Communication Channel: Real-time communication is vital for collaborative editing. WebSocket [6] technology enables immediate and bidirectional data exchange between web browsers (clients) and servers, facilitating seamless interactions. Upon initiating document editing, a WebSocket connection is established. This channel relays all modifications to the server, managing live chat, user details, cursor positions, and notifications among connected users.
ShareDB: To enable seamless collaboration, a real-time database was required. After careful research, ShareDB emerged as the optimal choice. ShareDB, built on Operational Transformation, operates as a real-time in-memory database. It stores JavaScript objects on the server and shares them among multiple clients through WebSockets. Documents in ShareDB include properties such as Version (incrementing from 0), Type (e.g., OT-text, OT-json1), and Data, which is the content to be stored in the database. Algorithm 1 was designed to operationalize this approach. When a user starts editing an RDF document, a new document with an initial version of 0 and the RDF data to be inserted is created in ShareDB. If the document already exists in ShareDB, its existing path is returned to the user. For Operational Transformation, we use the plain-text type. This type is used for editing plain text documents and supports three operations: skipping forward N characters, inserting str at the current position, and deleting N characters at the current position. Clients subscribe to ShareDB documents and update the document's state with insertions and deletions by modifying the index position and content. Each change increments the version number and is stored in ShareDB. These operations are transmitted over WebSockets, updating the local document state of every connected client.
RDF Validator: OntoEditor uses JavaScript parser libraries for real-time validation of various RDF serialization formats (Turtle, RDF/XML, JSON-LD). Users receive immediate error messages and can fix syntax errors while editing. To validate RDF, established parsers are employed: N3.js for Turtle, RDF/XML streaming parser for RDF/XML, and JSON-LD streaming parser for JSON-LD. These parsers operate in a streaming manner, which allows large documents to be handled with limited memory. During the editing process, users can select their desired format. The chosen parser is activated accordingly and connected to an onChange function so that syntax is checked automatically while typing. The syntax checker can be toggled on or off and is active by default. It identifies the format from the URL path and calls the corresponding parser. Parsing occurs in a streaming manner, providing parsed triples and highlighting any syntax errors. Meaningful error messages are displayed on top of the editor for immediate user visibility. Once the error is corrected, a 'Syntax correct, all triples parsed successfully' message is shown.

The figure below shows OntoEditor's workflow. The diagram illustrates the authentication process in Git, file selection, and the initiation of RDF editing. It highlights the collaborative nature facilitated by a unique shareable link, enabling simultaneous editing and syntax checking. Completed changes can be committed and pushed to the remote repository.
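
The three plain-text operations described for ShareDB (skip, insert, delete) can be illustrated with a small sketch. This is a simplified stand-in written for this README and not ShareDB's own code: the function names `applyTextOp` and `submitOp`, the `{ d: N }` shape for a delete, and the document object are assumptions chosen for illustration.

```javascript
// Sketch only: a simplified stand-in for plain-text Operational Transformation.
// An operation is a list of components applied from left to right:
//   number   -> skip forward N characters
//   string   -> insert the string at the current position
//   { d: N } -> delete N characters at the current position (assumed shape)
function applyTextOp(text, op) {
  let pos = 0;
  let result = text;
  for (const component of op) {
    if (typeof component === "number") {
      pos += component;
    } else if (typeof component === "string") {
      result = result.slice(0, pos) + component + result.slice(pos);
      pos += component.length;
    } else if (component && typeof component.d === "number") {
      result = result.slice(0, pos) + result.slice(pos + component.d);
    } else {
      throw new Error("Unknown operation component");
    }
    if (pos > result.length) {
      throw new Error("Operation runs past the end of the document");
    }
  }
  return result;
}

// Hypothetical document shape mirroring the Version / Type / Data properties.
// Every applied change increments the version number.
function submitOp(doc, op) {
  return {
    version: doc.version + 1,
    type: doc.type,
    data: applyTextOp(doc.data, op),
  };
}

const doc0 = { version: 0, type: "text", data: "ex:Cat a ex:Animal ." };
// Skip "ex:", delete "Cat", insert "Dog".
const doc1 = submitOp(doc0, [3, { d: 3 }, "Dog"]);
console.log(doc1.version, doc1.data);
```

In a real deployment the operation would be sent over the WebSocket and transformed against concurrent operations on the server; the sketch only shows how a single operation changes the text and the version.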

OntoEditor Implementation:

As shown in the following figure, OntoEditor has three modules. The first module handles communication with the remote repository, the second enables collaboration, and the third performs syntax validation.

Repository Communication:

Given the requirement to provide low-threshold access to the repository, we decided to prioritize the availability of a web interface over the underlying version control system. GitHub, GitLab, and Bitbucket all provide a RESTful API. By consuming these REST APIs, we allow users to communicate directly with their repository. This is especially useful for users who prefer editing RDF in a web-based client and don't want to install Git on their local machines. To use OntoEditor, users have to authenticate themselves with these repository hosting services. Authentication can be based on a username and password or on a personal access token. With GitHub and GitLab, authentication is only possible with a personal access token, as they no longer support username and password authentication. Bitbucket users can still use both authentication methods. After authentication, we show users the list of their repositories and all the branches in the respective repository. A typical fetch request is shown in the Fetching Script below. Here we call the GitLab REST API to get the projects/repositories of a user whose credentials are passed in the request headers.

                
    // Fetching Script: fetching the repositories owned by the user.
    // `token` is the user's personal access token; `res` is the response
    // object of the surrounding server-side route handler.
    fetch("https://gitlab.com/api/v4/projects?owned=true", {
      headers: { Authorization: "Bearer " + token },
    })
      .then(function (response) {
        if (response.ok) {
          return response.json().then((data) => {
            res.status(200).json({ repos: data });
          });
        }
        res.status(400).json({ err: response.statusText });
      })
      .catch(function (error) {
        // Network errors carry a message rather than a statusText.
        res.status(400).json({ err: error.message });
      });
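
The two authentication methods mentioned above differ only in the header that accompanies each request. The following is a minimal sketch, assuming a bearer header for personal access tokens (as in the Fetching Script) and standard HTTP Basic authentication for username and password. The helper name `buildAuthHeaders` and the shape of the `credentials` object are hypothetical and not taken from the OntoEditor source.

```javascript
// Sketch only: builds the Authorization header for a repository API request.
// `credentials` is a hypothetical object: { token } or { username, password }.
function buildAuthHeaders(credentials) {
  if (credentials.token) {
    // Personal access token, sent as a bearer token.
    return { Authorization: "Bearer " + credentials.token };
  }
  if (credentials.username && credentials.password) {
    // Assumption: standard HTTP Basic authentication (base64 of "user:password").
    const encoded = Buffer.from(
      credentials.username + ":" + credentials.password
    ).toString("base64");
    return { Authorization: "Basic " + encoded };
  }
  throw new Error("No usable credentials supplied");
}

console.log(buildAuthHeaders({ token: "example-token" }));
```

The returned object can be passed as the `headers` option of `fetch`. Whether a given hosting service accepts a particular scheme should be checked against that service's API documentation.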

The files are then filtered by file format. We only show files with the extensions '.ttl', '.rdfxml', '.jsonld', '.rdf', '.json', and '.txt'. All other files in the repository are ignored. Users can add new files to their repository if they want to start building an ontology from scratch, and a delete option is available for removing files. The following figure shows the list of repositories and files returned by GitHub and how we display them in dropdown menus. Based on the user's selection, we fetch the file and display it in our web frontend, where the user can start editing. After developing the ontology, they can commit their changes to the repository. Users can share their editing link with other users, but committing changes always requires authentication, and only users who have access to the respective repository are able to commit.
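
The extension filter described above can be sketched as follows. The extension list comes from the text; the function names and the case-insensitive comparison are assumptions made for this example.

```javascript
// Extensions that the editor shows; all other files are ignored.
const RDF_EXTENSIONS = [".ttl", ".rdfxml", ".jsonld", ".rdf", ".json", ".txt"];

// Assumption: extensions are compared case-insensitively.
function isEditableFile(path) {
  const lower = path.toLowerCase();
  return RDF_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

function filterEditableFiles(paths) {
  return paths.filter(isEditableFile);
}

const files = [
  "README.md",
  "ontology/pizza.ttl",
  "data/context.jsonld",
  "logo.png",
  "notes.txt",
];
console.log(filterEditableFiles(files));
// → [ 'ontology/pizza.ttl', 'data/context.jsonld', 'notes.txt' ]
```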

A conflict can arise when user A edits the file directly in the repository. This updates the SHA-1 of the file, and if user B is already editing it in our editor, a Git conflict can occur, because a commit from the editor pushes the new content and replaces the complete file in the repository. We wanted users to be able to see whether there are any new commits in their repository. To this end, we check the history of the file every 60 seconds and show the name of the last committer and the time of the last commit in the editor. This way, the user knows whether there have been any new changes.

When the user wants to commit the file, we show them a diff screen built with the Mergely JavaScript library. It shows the difference between the two files and provides options to merge the content line by line or entirely. This feature was inspired by how Visual Studio handles Git conflicts. The user can see the latest file in the repository and the local editor state of the document in a side-by-side comparison, and can compare the parts to be merged or replaced before committing. As shown in the figure below, user B was editing in our editor while user A pushed new changes directly from GitHub. We then show a merge screen to user B, where they can select the changes they want to keep or discard before committing.
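
The 60-second history check can be sketched as a comparison between the commit the editor loaded and the newest commit on the remote. The names `describeRemoteChange`, `watchRemote`, and `fetchLatestCommit`, as well as the `{ sha, author, date }` commit shape, are hypothetical. The actual request to the hosting service is left to a caller-supplied function.

```javascript
const POLL_INTERVAL_MS = 60 * 1000; // the 60-second interval described above

// Returns a notice if the newest remote commit differs from the one the
// editor loaded, or null if nothing has changed.
function describeRemoteChange(loadedSha, latestCommit) {
  if (!latestCommit || latestCommit.sha === loadedSha) return null;
  return `Last commit by ${latestCommit.author} at ${latestCommit.date}`;
}

// `fetchLatestCommit` is a caller-supplied async function (hypothetical) that
// resolves to { sha, author, date } for the file being edited.
// Returns a function that stops the polling.
function watchRemote(loadedSha, fetchLatestCommit, onChange) {
  const timer = setInterval(async () => {
    try {
      const notice = describeRemoteChange(loadedSha, await fetchLatestCommit());
      if (notice) onChange(notice);
    } catch (err) {
      console.error("Could not check file history:", err.message);
    }
  }, POLL_INTERVAL_MS);
  return () => clearInterval(timer);
}
```

A caller would start the watcher when the file is opened and call the returned function when the editor is closed.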

Collaboration

The collaboration service is implemented as a separate module that communicates over WebSocket rather than HTTP. Because of the large number of editing operations, the complexity of collaboration was extracted into an independent microservice.
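
To illustrate how such a service can relay edits, chat messages, cursor positions, and notifications between the users of one shared document, here is a minimal in-memory sketch. It does not use a real WebSocket library. The `Room` class, the message `type` values, and the stand-in clients are assumptions for illustration; a client only needs a `send(string)` method, as a WebSocket connection has.

```javascript
// Hypothetical in-memory room: one per shared document.
class Room {
  constructor() {
    this.clients = new Set();
  }
  join(client) {
    this.clients.add(client);
  }
  leave(client) {
    this.clients.delete(client);
  }
  // Relay a message to every connected client except the sender.
  broadcast(sender, message) {
    const payload = JSON.stringify(message);
    for (const client of this.clients) {
      if (client !== sender) client.send(payload);
    }
  }
}

// Assumed message types, following the kinds of data the channel carries.
const MESSAGE_TYPES = new Set(["op", "chat", "cursor", "notification"]);

// Parses an incoming raw message and relays it if its type is known.
function handleMessage(room, sender, raw) {
  let message;
  try {
    message = JSON.parse(raw);
  } catch (err) {
    return false; // ignore malformed input
  }
  if (!message || !MESSAGE_TYPES.has(message.type)) return false;
  room.broadcast(sender, message);
  return true;
}

// Stand-in clients that collect what they receive.
function fakeClient() {
  const inbox = [];
  return { inbox, send: (payload) => inbox.push(payload) };
}

const room = new Room();
const alice = fakeClient();
const bob = fakeClient();
room.join(alice);
room.join(bob);
handleMessage(room, alice, JSON.stringify({ type: "cursor", user: "alice", line: 4, ch: 10 }));
console.log(bob.inbox.length, alice.inbox.length); // → 1 0
```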

RDF Validation and Error Reporting

We use JavaScript parser libraries to validate the different RDF serialization formats. When starting the editing process, a user can select the "turtle", "rdf/xml", or "jsonld" format. Based on the user's selection, we use the parser for that serialization format. Just as we use the onChange function for insertions and deletions, we also connect our syntax checker function to it. This ensures that the syntax is checked automatically every time a user types something. The user can enable or disable the syntax checker, which is activated by default. The syntax checker function first detects the RDF serialization format from the URL path. Depending on the format, we call the parser for that specific format and pass it the entire content of the document. The parser then parses the input in a streaming way and returns the parsed triples and any syntax errors in the input. We display meaningful error messages on top of the editor so that the user can read them at any time. As soon as the user corrects the error, the success message "All triples are parsed, Syntax correct" is displayed.
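
The dispatch logic described above (detect the format from the URL path, call the matching parser, report success or the error) can be sketched as follows. The URL layout `/edit/<format>/<document>`, the format keys, and the function names are assumptions. The real streaming parsers are event-based, so the synchronous stand-in parser used here only demonstrates the control flow.

```javascript
// Assumed format keys as they might appear as a segment of the URL path.
const KNOWN_FORMATS = ["turtle", "rdfxml", "jsonld"];

// Assumption: the format is one segment of the path, e.g. /edit/turtle/doc-1.
function detectFormat(urlPath) {
  const segments = urlPath.toLowerCase().split("/");
  return KNOWN_FORMATS.find((format) => segments.includes(format)) || null;
}

// `parsers` maps a format key to a function that returns the parsed triples
// or throws on a syntax error. `enabled` mirrors the on/off toggle.
function checkSyntax(urlPath, content, parsers, enabled = true) {
  if (!enabled) return { checked: false };
  const format = detectFormat(urlPath);
  if (!format || !parsers[format]) return { checked: false };
  try {
    const triples = parsers[format](content);
    return {
      checked: true,
      ok: true,
      message: "All triples are parsed, Syntax correct",
      triples,
    };
  } catch (err) {
    return { checked: true, ok: false, message: err.message };
  }
}

// Stand-in parser for the demo only: it checks that the input is well-formed
// JSON, which is far weaker than real JSON-LD validation.
const demoParsers = { jsonld: (content) => [JSON.parse(content)] };

const good = checkSyntax("/edit/jsonld/doc-1", '{"@id": "ex:a"}', demoParsers);
const bad = checkSyntax("/edit/jsonld/doc-1", '{"@id": ', demoParsers);
console.log(good.ok, good.message);
console.log(bad.ok, bad.message);
```

In the editor, a function like `checkSyntax` would be called from the onChange handler with the full document content, and the returned message would be rendered on top of the editor.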

Installation:

Requirements

Node.js, for installing the development tools and dependencies.

OntoEditor Installation

Navigate to the root folder.
Run npm install to install the dependencies and build the project.
Run npm start
Then, OntoEditor GUI is accessible at http://localhost:5000/

Running Using Docker

You can also run OntoEditor using Docker. If Docker is not yet installed on your machine, install it first. Once you have Docker, you can issue the following command to download the OntoEditor Docker image:

```
docker pull ahemid/ontoeditor
```

Alternatively, build the image locally from the root folder:

```
docker build . -t ahemid/ontoeditor
```

Next, create the ontoeditor docker container using the following command:
```
docker run -d -p 5000:5000 -p 8080:8080 ahemid/ontoeditor 
```
Then, OntoEditor GUI is accessible at http://localhost:5000/