
Nautilus - Evolution Tool for Graph Databases

Graph databases are schemaless, allowing flexible storage of interconnected data. As data is updated, manipulated or changed, however, this flexibility can lead to heterogeneity. The cause is implicit structural change introduced by so-called evolution operations such as add, rename, delete, transform, merge, copy, split or move.

To describe these implicit structural changes, we developed a domain-independent evolution language named GEO (Graph Evolution Operation), which specifies how each evolution operation works. With its intuitive syntax, GEO is intended for use by both experts and non-experts. Making GEO available at all levels is therefore of the utmost importance for supporting potential users.



Research Questions

  1. How can implicit structural changes be described for graph databases in general?
    • Through our domain-independent evolution language named GEO
    • GEO was extended by the graph-specific operation transform, showing the complexity of interconnected data in graph databases
  2. How can graph database evolution be made accessible and easy to use for a wide range of users?
    • A first version of Nautilus has been implemented


Contribution

Nautilus benefits from GEO, whose syntax is close to natural language. We therefore plan to implement the evolution language in order to widen the range of users of graph databases in general and to ease their usage, e.g. in interdisciplinary research projects. The novelty lies in the fact that no evolution language with graph-specific operations such as transform existed before. Some evolution operations, such as splitting nodes or relationships at a specified property key, are available neither in Neo4j's query language Cypher nor in the APOC library, so a workaround is needed. Schema Modification Operations (SMO) enable a precise translation into such a Cypher workaround, which makes even these operations available in Nautilus. Users thus benefit from an easy-to-understand language for executing complex workarounds on their database.
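To make the idea of a Cypher workaround more concrete, the following minimal Python sketch builds one possible Cypher statement for splitting a node at a property key. It is not the translation Nautilus or the SMO actually produce. The function name split_node_cypher, the example labels and the chosen semantics (move the property to a new node and link it to the original) are assumptions made for illustration only.

```python
def split_node_cypher(label: str, key: str, new_label: str, rel_type: str) -> str:
    """Build a Cypher statement that moves property `key` from nodes with
    `label` onto a new node with `new_label`, linked via `rel_type`.

    Hypothetical helper: this is one possible reading of "split at a
    property key", not the workaround generated by Nautilus.
    """
    # Assumption: only simple identifiers are accepted, so no escaping is needed.
    for name in (label, key, new_label, rel_type):
        if not name.isidentifier():
            raise ValueError(f"unsupported identifier: {name!r}")
    return (
        f"MATCH (n:{label})\n"
        f"WHERE n.{key} IS NOT NULL\n"                                  # only nodes that carry the key
        f"CREATE (n)-[:{rel_type}]->(:{new_label} {{{key}: n.{key}}})\n"  # copy the value to a new node
        f"REMOVE n.{key}"                                               # drop it from the original node
    )


if __name__ == "__main__":
    # Example values (Person, street, Address, HAS_ADDRESS) are invented.
    print(split_node_cypher("Person", "street", "Address", "HAS_ADDRESS"))
```

The generated statement uses only standard Cypher clauses (MATCH, WHERE, CREATE, REMOVE). It illustrates why a single high-level split operation is easier for non-experts than writing the multi-step query by hand.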


Current Status and Next Steps

Nautilus is a first implementation of an evolution approach, on the basis of which we plan an interactive interface that visualizes the schema as well as structural profiles, so that users can keep track of the impact of an evolution in terms of:

  • How many entity types (nodes, relationships) or features (label, type, property key) will be affected (none, some, which)?

The user interface of Nautilus offers an interactive scatter plot that illustrates structural database statistics (SDS) for this purpose. SDS are a hybrid approach that combines schema information with database statistics to visualize which data is currently stored in the database. In addition, the SDS before and after the evolution process are compared with one another.
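The before/after comparison can be sketched in a few lines of Python. This is a simplified illustration, not the SDS implementation in Nautilus: it assumes that the statistics are available as a plain mapping from a feature name (label, type or property key) to the number of entities carrying it. The function name diff_statistics and all sample values are invented.

```python
def diff_statistics(before: dict[str, int], after: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {feature: (count_before, count_after)} for every feature
    whose count differs between the two snapshots."""
    changed = {}
    for feature in sorted(set(before) | set(after)):
        b = before.get(feature, 0)  # a missing feature counts as 0 occurrences
        a = after.get(feature, 0)
        if b != a:
            changed[feature] = (b, a)
    return changed


if __name__ == "__main__":
    # Invented sample snapshots around a hypothetical split operation.
    before = {"label:Person": 120, "property:Person.street": 95}
    after = {"label:Person": 120, "label:Address": 95, "type:HAS_ADDRESS": 95}
    for feature, (b, a) in diff_statistics(before, after).items():
        print(f"{feature}: {b} -> {a}")
```

An empty result corresponds to "none" affected, and the keys of a non-empty result answer "which" features changed.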

The next steps in development are to find solutions for the following questions:

  • Will an operation result in a relaxed schema? Are there already optional elements?
  • What impact does an evolution operation have on the schema of one's graph database, and is this outcome intended?


Publications


Faculty of Informatics and Data Science

Chair of Data Engineering

Dominique Hausler


Phone: 0941 943-68609

Email: dominique.hausler@ur.de

Room: 626