

Democratising and making sense out of heterogeneous scholarly content

SciLake builds upon the OpenAIRE ecosystem and EOSC services to enable the creation, interlinking, and maintenance of Science Knowledge Graphs (SKGs) and the execution of data science and graph mining queries on top of them. Its goal is to unlock the vast scientific knowledge space with advanced, AI-based services that exploit customized perspectives.
Identify and address domain-specific cross-disciplinary information needs while managing heterogeneous scholarly content
SciLake is building a scientific data lake to store, interlink, and analyze both discipline-specific and cross-domain Science Knowledge Graphs, as well as raw, unstructured scholarly content.
Democratize scholarly content by facilitating the creation, interlinking, and management of community-based SKGs
SciLake will develop and make openly available a toolkit for acquiring raw and unstructured scholarly content and for creating, interlinking, and managing SKGs. The toolkit, in combination with mechanisms that facilitate community management, will pave the way for the democratization of scholarly content, giving researchers full control. Based on the OpenAIRE data space, an open, transparent, and comprehensive Scientific Lake will be deployed and maintained under SciLake.
Identify research trends and valuable research objects
SciLake will provide AI-assisted services for identifying research trends and valuable research objects (publications, datasets, software, etc.) by applying advanced techniques for automated evaluation of research impact according to various perspectives (different aspects of scientific impact, but also societal or economic impact).
Assess research reproducibility and replicability/repeatability
SciLake will develop a set of AI-assisted services to assess research reproducibility (i.e., whether the work is ready to be reproduced) and replicability/repeatability (i.e., how many times the findings have been replicated). The services will leverage the Scientific Lake content and will address technical challenges related to identifying research objects and missing links among them in full-text publications, understanding the citation context, even in non-English texts, and other domain-specific issues.
Customise, test, and demonstrate developed services in real-world scenarios
Four research community pilots in the fields of neuroscience, cancer, transportation, and energy research participate in the testing, validation, and evaluation of SciLake services that reflect their needs. With its pilots, SciLake seeks to expand EOSC services to areas currently underrepresented in EOSC, such as Transport and Energy research, which are not part of large research infrastructures.
Leverage & enrich EOSC services
SciLake will build upon EOSC functionalities (e.g., complying with the EOSC interoperability framework for monitoring, accounting, and AAI) and will integrate its open-source services into the portfolio of EOSC core services.

Vergoulis Thanasis

Scientific Director