Graphenus integrates different machine learning libraries to allow data science teams to develop any type of model, abstracting them from the complexity of distributed data management (infrastructure, configuration, processing, etc.).

Ease of use

  • Support for multiple programming languages: Java, Python, Scala, R.
  • Out-of-the-box integration with any source of information supported by Graphenus.
  • Abstraction of distributed computing complexity.

High performance and scalability

  • Execution of machine learning models and algorithms on Spark.
  • Availability of multiple pre-optimised models.
  • Guaranteed scalability thanks to Graphenus and its container-based architecture.

Broad ecosystem

  • High volume of ML algorithms available (classification, regression, recommendation, clustering, etc.).
  • Workflow utilities for feature transformation, pipeline definition, model evaluation, persistence, etc.
  • Neural network models, genetic algorithms, tensors, etc.
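
The workflow utilities mentioned above follow the transformer/estimator pipeline pattern popularised by Spark ML. As a rough illustration of that pattern only (class and method names here are hypothetical, not Graphenus or Spark APIs), a pipeline chains feature transformations and ends with a model:

```python
# Minimal sketch of the transformer/estimator pipeline pattern.
# All names are illustrative, not a real Graphenus or Spark ML API.

class Scaler:
    """Transformer: rescales each feature column to the [0, 1] range."""
    def fit(self, rows):
        cols = list(zip(*rows))
        self.mins = [min(c) for c in cols]
        self.maxs = [max(c) for c in cols]
        return self

    def transform(self, rows):
        return [
            [(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(r, self.mins, self.maxs)]
            for r in rows
        ]

class MeanModel:
    """Estimator: trivial model that always predicts the mean label."""
    def fit(self, rows, labels):
        self.mean = sum(labels) / len(labels)
        return self

    def predict(self, rows):
        return [self.mean] * len(rows)

class Pipeline:
    """Chains transformers and ends with an estimator, fit end to end."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows, labels):
        for stage in self.stages[:-1]:
            rows = stage.fit(rows).transform(rows)
        self.stages[-1].fit(rows, labels)
        return self

    def predict(self, rows):
        for stage in self.stages[:-1]:
            rows = stage.transform(rows)
        return self.stages[-1].predict(rows)

X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
y = [1.0, 2.0, 3.0]
pipe = Pipeline([Scaler(), MeanModel()]).fit(X, y)
preds = pipe.predict(X)
```

The value of the pattern is that preprocessing and model are fitted and persisted as one unit, so the same transformations are applied at training and at prediction time.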

Graphenus has a solid foundation to support the development of machine learning models.

Building on the available Spark ML base, new libraries are added to provide Graphenus with differential ML capabilities:

  • Graphenus is fully compatible with the major ML libraries: scikit-learn, pandas, TensorFlow, PyTorch, MLflow and Spark MLlib.
  • Thanks to Spark, ML models can be run on a fully distributed basis, using virtually any data source, in a way that is transparent to the developer.
  • Federated Learning: a machine learning technique in which a model is trained across multiple processing nodes, each using its own local dataset, without the data ever leaving the nodes. In other words, the model is obtained in a decentralised manner, avoiding the need to concentrate all the data in one location. The data flow consists only of the parameters that make up the model, never the local data of each node, which makes the technique particularly useful when the data is sensitive or private.
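
The federated approach described above can be sketched with federated averaging (FedAvg): each node takes a local training step on its private data, and only the resulting parameters are sent back and averaged. This is an illustrative toy sketch (all function names are hypothetical, not the Graphenus API), using 1-D linear regression so the whole round fits in a few lines:

```python
# Illustrative federated averaging (FedAvg) round: nodes train locally,
# only model parameters travel; raw data never leaves a node.
# All names are hypothetical, not a real Graphenus API.

def local_step(weights, data, lr=0.1):
    """One gradient step of 1-D linear regression (y = w*x) on local data."""
    w = weights[0]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return [w - lr * grad]

def federated_round(global_weights, node_datasets):
    """Each node updates locally; the server only sees parameters."""
    local_updates = [local_step(global_weights, d) for d in node_datasets]
    # Average the parameters across nodes, never the data itself.
    return [sum(ws) / len(local_updates) for ws in zip(*local_updates)]

# Two nodes, each holding private samples of the same relation y = 2*x.
nodes = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
]
weights = [0.0]
for _ in range(50):
    weights = federated_round(weights, nodes)
# weights[0] converges towards 2.0 without either node's data moving.
```

Real deployments add secure aggregation, client sampling and multiple local epochs per round, but the communication pattern is the same: parameters out, averaged parameters back.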