Exporting Data
TopBraid EDG supports the export of metadata and data to JSON-LD, RDF/XML, N-Triples, Turtle, Turtle+, TriG, JSON, CSV, TSV and XML through pre-built functions, a SPARQL endpoint and a GraphQL endpoint.
Search and the SPARQL Query and SPARQL Results panels support the export of subsets of the data in an asset collection based on custom criteria and sorting. These panels provide fine-grained control over the selection of data. Menus in these panels offer several choices to export the results into spreadsheet-compatible formats (e.g., for Excel) as well as other formats.
See also
See Search Panel and Using SPARQL to Query and Modify Data and SPARQL Results Panel.
Bulk export to AWS S3 can be done through the Basket (see Bookmarking Asset Collections and Assets).
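For scripted exports, the SPARQL endpoint mentioned above can be called over HTTP. The following is a minimal sketch that only relies on the standard SPARQL 1.1 Protocol (a form-encoded POST with a `query` parameter and an `Accept` header that selects the result format). The endpoint URL is a hypothetical placeholder, and authentication and the choice of graph to query depend on your server setup and are not shown.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder: replace with your server's SPARQL endpoint address.
ENDPOINT = "https://edg.example.org/sparql"


def build_export_request(endpoint, query, accept="text/csv"):
    """Build a SPARQL Protocol POST request asking for results in `accept` format."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            # "text/csv" and "text/tab-separated-values" are the standard
            # media types for SPARQL CSV and TSV results.
            "Accept": accept,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Illustrative query: the first 100 triples, sorted by subject.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
ORDER BY ?s
LIMIT 100
"""

request = build_export_request(ENDPOINT, QUERY)

# To send the request to a real server (authentication not shown):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the headers and body before pointing the script at a live server.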
Export Collection as an RDF File
Export <collection type> as an RDF File is available under the Export tab.
These operations support exporting a collection’s data in a standard RDF serialization format.
Sorted Turtle includes an extension that supports the TopBraid reification capabilities. Reified statements in EDG are converted to standard RDF Statements in the exported file.
Sorted Turtle+ is an extension to the standard Sorted Turtle that supports TopBraid’s reification capabilities known as “RDF-star” or “RDF*”. In this case, the serialization is Turtle+ and requires a Turtle+ parser in order to import the file.
See also
RDF-star is not yet an official W3C standard, but work on the specification is well under way. TopQuadrant is one of the contributors to the specification document. In the meantime, many RDF technology vendors, including TopQuadrant, have already implemented support for this concept.
Browser interactions during export vary. The data may be displayed directly (for example, through a view-source command), or the browser may offer a right-click menu option on the link to save the link target to a file without first displaying the result, prompting for the file location and name.
The export of multiple asset collections is supported by placing them into the Basket and then selecting the Export to S3 option.
This requires that at least one S3 bucket is configured by the EDG Administrator. Once set up, select a bucket using Select S3 Bucket for Exports under the Manage tab. The export will run in the background and, once exported, files will be saved to the S3 bucket. This option supports specifying that the exported files be compressed.
Export Collection with Includes as a File
Export <collection type> with Includes as a File is available under the Export tab. Two formats for export with includes are available, TriG and Zip File.
The With inferences options add a dedicated graph named urn:x-topbraid:inferences, which contains any triples inferred via SHACL or SPIN rules.
Note
Inferences are computed on-the-fly and therefore the export may be slow.
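After downloading a Zip File export, it can be useful to check what the archive contains before importing it elsewhere. The sketch below uses only Python's standard `zipfile` module and makes no assumption about how EDG names or arranges the files inside the archive; it reports whatever entries are present. The file name in the usage comment is hypothetical.

```python
import zipfile


def summarize_export_archive(archive):
    """Return (file name, uncompressed size in bytes) for each file in the archive.

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return [
            (info.filename, info.file_size)
            for info in zf.infolist()
            if not info.is_dir()  # skip directory entries
        ]


# Example usage with a hypothetical file name:
# for name, size in summarize_export_archive("my-collection-export.zip"):
#     print(name, size)
```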
Publish for Explorer Users
On an EDG server that is paired with a TopBraid Explorer Server (for read-only access), managers can publish an asset collection to the Explorer for viewing.
Note
The working copies of all published asset collections might or might not be viewable, depending on the Explorer’s administrative configuration.
Any manager of an asset collection can control its Explorer publication status by selecting Export > Publish Glossary for Explorer Users. The view shows a Status drop-down for the asset collection, which indicates whether the asset collection was ever Published or not (Unpublished).
It also lists any included asset collections that might also require publication.
Ensure that all included graphs are either already present on the Explorer server or published along with the asset collection. Changing the status triggers one of the following actions.
Current Status | Chosen Option | Result |
---|---|---|
Unpublished | Published | Sends a copy of the asset collection and selected includes to the Explorer server. Changes the source collection’s status to Published. |
Published | Update Published Copy | Re-sends a current copy and selected includes to the Explorer server, overwriting the previous version(s). Keeps the source collection’s status as Published. |
Published | Unpublished | Deletes the asset collection on the Explorer server. Changes the source collection’s status to Unpublished. |
GraphQL Queries
This option allows users of the collection to retrieve and modify asset collection data using new or saved GraphQL queries.
See GraphQL for details.
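GraphQL queries can also be sent from a script. The sketch below follows the common GraphQL-over-HTTP convention of POSTing a JSON body with a `query` field. The endpoint URL is a hypothetical placeholder, and the query shown is the standard GraphQL introspection query rather than anything specific to an EDG schema; authentication is not shown.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the GraphQL endpoint of your asset collection.
ENDPOINT = "https://edg.example.org/graphql"

# Standard GraphQL introspection query; it works against any GraphQL schema
# that has introspection enabled.
INTROSPECTION_QUERY = "{ __schema { queryType { name } } }"


def build_graphql_request(endpoint, query, variables=None):
    """Build a JSON POST request carrying a GraphQL query and optional variables."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_graphql_request(ENDPOINT, INTROSPECTION_QUERY)

# To send the request to a real server (authentication not shown):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```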
Asset Collection Specific Exports
These exports are only available for specific collection types.
Export Hierarchy Spreadsheet
Export Hierarchy Spreadsheet is available under the Export tab only for Taxonomies collections. It outputs an entire taxonomy tree in a spreadsheet-compatible format.
Export Concept Overview Spreadsheet
Export Concept Overview Spreadsheet is available under the Export tab only for Taxonomies collections. It outputs all taxonomy concepts in a spreadsheet-compatible format. For each concept, this output includes every available property as a spreadsheet column.
Export Ontology as RDF Schema File
Export Ontology as RDF Schema File is available under the Export tab only for Ontologies collections. It produces a simple, approximated RDF Schema version of the current ontology in SHACL. The output is in Turtle format.
Export Ontology as OWL File
Export Ontology as OWL File is available under the Export tab only for Ontologies collections. It produces a simple, approximated OWL version of the current ontology in SHACL. The output is in Turtle format.
Export Crosswalk as Spreadsheet
Export Crosswalk as Spreadsheet is available under the Export tab only for Crosswalks collections. It creates a comma-separated spreadsheet containing one row for each mapping in a Crosswalk.
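Because the crosswalk export is comma-separated with one row per mapping, it can be post-processed with any CSV reader. The sketch below uses Python's standard `csv` module and assumes the file starts with a header row. The column names in the sample (`source`, `target`) are hypothetical; use the headers found in your exported file.

```python
import csv
import io


def read_mappings(csv_text):
    """Parse comma-separated text into a list of dicts, one per mapping row.

    Assumes the first line is a header row naming the columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(reader)


# Hypothetical sample data; real column headers come from the exported file.
sample = (
    "source,target\n"
    "http://example.org/a,http://example.org/b\n"
    "http://example.org/c,http://example.org/d\n"
)

mappings = read_mappings(sample)
for row in mappings:
    print(row["source"], "->", row["target"])
```

For a file on disk, open it with `newline=""` and pass the file object to `csv.DictReader` directly, as recommended by the `csv` module documentation.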
Export Avro JSON
Export Avro JSON is available under the Export tab only for Data Assets collections. It creates one or more Avro files in JSON format for database tables and allows users to select the tables to export.
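An exported file can be inspected with a plain JSON parser. The sketch below assumes each file is an Avro schema of type `record` with a `fields` array, as defined by the Avro specification; whether EDG's output has exactly this shape is an assumption, and the sample schema (a `customer` table with `id` and `email` columns) is hypothetical.

```python
import json


def field_summary(schema_text):
    """Return (field name, field type) pairs from an Avro record schema in JSON."""
    schema = json.loads(schema_text)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro schema of type 'record'")
    return [(field["name"], field["type"]) for field in schema["fields"]]


# Hypothetical sample schema standing in for one exported table.
sample_schema = """
{
  "type": "record",
  "name": "customer",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": ["null", "string"]}
  ]
}
"""

for name, avro_type in field_summary(sample_schema):
    print(name, avro_type)
```

In Avro, a type written as a list such as `["null", "string"]` is a union, which is the usual way to express a nullable column.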
Normalized Concepts
Normalized Concepts (Troubleshooting) is available under the Export tab only for Content Tagsets collections. It generates a normalized version of the Tagging vocabulary used in a Content Tag Set, as it would be seen by AutoClassifier (useful for troubleshooting). The output is in Turtle format.
Export to S3
Setup
From Server Administration in EDG, configure your S3 bucket in External System Integration Management.
Choose the authentication type appropriate for your organization. See https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html#credentials-default
Test the connection after saving.
In EDG Configuration, select the bucket to use as the default S3 storage location for your EDG application and save.
Using the S3 Export
S3 Export uses the Basket feature of EDG.
Add your collection(s) to the Basket.
Navigate to the Basket.
Check the collections you want to export and select Export to S3.
The version is added automatically from the version metadata in the collection. Change any settings as needed and then submit.
The export will run in the background. An Administrator can check on the status through the Server Administration – Scheduled Jobs page. Once the job completes, it is no longer listed on this page. Success or failure is logged to the Tomcat logs, and notifications are sent (if enabled).
If email is configured on the EDG application, the user who submits the export will get an email notification when it is complete.
Folders are created automatically in S3 for each type of collection being exported.