Morph-KGC is an engine that constructs RDF knowledge graphs from heterogeneous data sources using the R2RML and RML mapping languages. It is built on top of pandas and uses mapping partitions to significantly reduce execution time and memory consumption for large data sources.
Features
- Supports the R2RML and RML mapping languages.
- User-friendly mappings with YARRRML.
- Transformation functions with RML-FNML, including Python user-defined functions.
- RDF-star generation with RML-star.
- RML views over tabular data sources and JSON files.
- Integration with RDFLib and Oxigraph.
- Optimized to materialize large knowledge graphs.
- Remote data and mapping files.
- Input data formats:
  - Relational databases: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB, SQLite.
  - Tabular files: CSV, TSV, Excel, Parquet, Feather, ORC, Stata, SAS, SPSS, ODS.
  - Hierarchical files: JSON, XML.
  - In-memory data structures: Python dictionaries, DataFrames.
  - Cloud data lake solutions: Databricks.
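As a rough illustration of the RML and RDFLib features listed above, the sketch below writes a tiny CSV file and an RML mapping to a temporary directory, builds a minimal configuration, and materializes the graph if Morph-KGC is installed. The file names, the `http://example.com/` namespace, the sample rows, and the helper functions `write_inputs`, `build_config`, and `main` are all hypothetical and chosen only for this example. The call `morph_kgc.materialize(config)` with an INI-style configuration string follows the usage described in the project's documentation, so check it against the version you have installed.

```python
import tempfile
from pathlib import Path

# Hypothetical RML mapping: one triples map over a CSV file.
# CSV_PATH is a placeholder that write_inputs() replaces with the real path.
MAPPING_TEMPLATE = """\
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix ex: <http://example.com/> .

ex:PersonMap
    rml:logicalSource [
        rml:source "CSV_PATH" ;
        rml:referenceFormulation ql:CSV
    ] ;
    rr:subjectMap [
        rr:template "http://example.com/person/{id}" ;
        rr:class ex:Person
    ] ;
    rr:predicateObjectMap [
        rr:predicate ex:name ;
        rr:objectMap [ rml:reference "name" ]
    ] .
"""


def write_inputs(workdir: Path) -> Path:
    """Write the sample CSV and the RML mapping; return the mapping path."""
    csv_path = workdir / "people.csv"
    csv_path.write_text("id,name\n1,Ada\n2,Grace\n", encoding="utf-8")

    mapping_path = workdir / "mapping.rml.ttl"
    mapping_path.write_text(
        MAPPING_TEMPLATE.replace("CSV_PATH", csv_path.as_posix()),
        encoding="utf-8",
    )
    return mapping_path


def build_config(mapping_path: Path) -> str:
    """Build a minimal INI-style configuration with a single data source."""
    return f"[DataSource1]\nmappings: {mapping_path.as_posix()}\n"


def main():
    with tempfile.TemporaryDirectory() as tmp:
        mapping_path = write_inputs(Path(tmp))
        config = build_config(mapping_path)

        try:
            import morph_kgc  # third-party package, may not be installed
        except ImportError:
            print("morph_kgc is not installed; skipping materialization")
            return None

        # Assumption: materialize() returns an RDFLib graph.
        graph = morph_kgc.materialize(config)
        for subject, predicate, obj in graph:
            print(subject, predicate, obj)
        return len(graph)


if __name__ == "__main__":
    main()
```

For a relational database, the data source section of the configuration would also need connection details, which this file-based sketch leaves out.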
Licenses
Morph-KGC is available under the Apache License 2.0.
The documentation is licensed under CC BY-SA 4.0.
Author
Ontology Engineering Group, Universidad Politécnica de Madrid.
Citing
If you use Morph-KGC in your work, please cite the Semantic Web journal (SWJ) paper:
@article{arenas2022morph,
  title   = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},
  author  = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. and Corcho, Oscar},
  journal = {Semantic Web},
  year    = {2022},
  doi     = {10.3233/SW-223135}
}