2026-02-20 Pre-defence slides
La Novitade
Defence
https://defence.arcangelomassari.com
time-agnostic-library
refactor!: replace python-dateutil with datetime.fromisoformat
BREAKING CHANGE: date/time values must now be in ISO 8601 format. Non-ISO formats (e.g., "May 21, 2021") are no longer accepted.
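A minimal sketch of what the breaking change means for callers, using only the standard library. The helper name `parse_iso` is hypothetical and not part of time-agnostic-library; it only illustrates that `datetime.fromisoformat` accepts ISO 8601 strings and raises `ValueError` for free-form dates.

```python
from datetime import datetime


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 string; raise ValueError for any other format.

    Hypothetical helper for illustration only.
    """
    return datetime.fromisoformat(value)


# An ISO 8601 timestamp with an explicit UTC offset is accepted.
print(parse_iso("2021-05-21T10:30:00+00:00"))

# A free-form date that python-dateutil would have parsed is now rejected.
try:
    parse_iso("May 21, 2021")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Note that before Python 3.11 `datetime.fromisoformat` accepts only a subset of ISO 8601 (for example, it rejects a trailing `Z`), so the exact set of accepted strings depends on the interpreter version.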
build!: adapt to time-agnostic-library v6.0.0
The library now returns N3-encoded string tuples instead of RDFLib Graph/Dataset objects from get_history() and get_state_at_time(). Add converter functions at the boundary to restore RDFLib objects for downstream code. Remove cache_endpoint and cache_update_endpoint parameters dropped from generate_config_file().
BREAKING CHANGE: requires time-agnostic-library >= 6.0.0
feat(benchmark): add disk usage tracking and per-query memory measurement
Replace resource.getrusage with tracemalloc for per-query peak memory tracking. Add timestamped run files with query-level resume support. Record OCDM, QLever, and OSTRICH disk usage from setup scripts. Add storage comparison and memory comparison plots to analysis.
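The per-query measurement could look roughly like the sketch below. `measure_peak` and the sample query are hypothetical names, not the benchmark's actual code; the point is that `tracemalloc` starts counting from zero on each call, whereas `resource.getrusage` reports a process-wide high-water mark that never decreases between queries.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak(query_fn: Callable[[], Any]) -> Tuple[Any, int]:
    """Run query_fn and return (result, peak traced bytes during the call).

    Hypothetical helper: assumes tracing is not already active elsewhere.
    """
    tracemalloc.start()
    try:
        result = query_fn()
        # get_traced_memory() returns (current, peak) in bytes since start().
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Stopping clears all traces, so the next call starts from zero.
        tracemalloc.stop()
    return result, peak


# Stand-in for a real query: build a list of strings.
rows, peak_bytes = measure_peak(lambda: [str(i) for i in range(10_000)])
print(f"{len(rows)} rows, peak {peak_bytes} bytes")
```

One caveat: `tracemalloc` only sees memory allocated through Python's allocators, so memory used by an external triplestore process is not included in these figures.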
HERITRACE
feat: replace Flask dev server with Gunicorn
Use Gunicorn as WSGI server in both development and production. Workers and timeout are configurable via GUNICORN_WORKERS and GUNICORN_TIMEOUT env vars, defaulting to (2 * CPU + 1) workers. Dev environment generates self-signed SSL certs and runs with --reload.
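A sketch of how the environment-driven defaults could be expressed in a Gunicorn Python config file. `workers` and `timeout` are Gunicorn's own setting names; the helper functions are hypothetical, and the 30-second fallback is Gunicorn's documented default, assumed here because the commit message does not state HERITRACE's value.

```python
import multiprocessing
import os
from typing import Mapping


def resolve_workers(env: Mapping[str, str], cpu_count: int) -> int:
    """Return GUNICORN_WORKERS if set, otherwise the (2 * CPU + 1) default."""
    raw = env.get("GUNICORN_WORKERS")
    return int(raw) if raw else 2 * cpu_count + 1


def resolve_timeout(env: Mapping[str, str], default: int = 30) -> int:
    """Return GUNICORN_TIMEOUT if set, otherwise the assumed default (seconds)."""
    raw = env.get("GUNICORN_TIMEOUT")
    return int(raw) if raw else default


# Module-level names that Gunicorn reads from a Python config file.
workers = resolve_workers(os.environ, multiprocessing.cpu_count())
timeout = resolve_timeout(os.environ)
```

Keeping the lookup in small functions that take the environment as an argument makes the defaults easy to check without touching the real process environment.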
build: migrate from Poetry to uv
Aldrovandi
feat: add SHACL validation of generated metadata against CHAD-AP shapes
Validate each stage's metadata against SHACL shapes during folder processing and report non-conforming stages at the end. Add pyshacl dependency and type annotations to generate_provenance_snapshots.
feat(zenodo): add CC0 Italian cultural heritage law disclaimer
Append a disclaimer about Italian cultural heritage regulations to Zenodo descriptions for CC0-licensed content. Fix license identifier assertion and add not-None guards in zip tests.
feat(zenodo): add keeper institution and location to record description
Extract curation activity data from the knowledge graph following the CHAD-AP ontology pattern (crm:E7_Activity with aat:300054277) to include the conserving institution and its location in each Zenodo record description.
feat(zenodo): add file scope description to license rights entries
Metadata license (CC0) now explicitly lists meta.ttl and prov.trig. Content license describes coverage as all data files except those two.
Zenodo does not use CRediT; it uses the DataCite contributorType vocabulary: https://datacite-metadata-schema.readthedocs.io/en/4.6/appendices/appendix-1/contributorType/
feat(zenodo): convert config format to InvenioRDM API schema
Restructure creators with person_or_org/role/affiliations format, split family_name/given_name fields, add datacollector/datacurator roles, convert related_identifiers and locations to InvenioRDM nested format, and add optional SHACL validation skip flag.
https://sandbox.zenodo.org/records/442870
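To illustrate the creator restructuring, here is a minimal sketch that turns a flat creator entry into the nested `person_or_org`/`role`/`affiliations` shape. The input layout (a `"Family, Given"` name string plus a single `affiliation`) and the function name are assumptions made for the example, not the project's actual config format.

```python
from typing import Any, Dict


def to_rdm_creator(flat: Dict[str, str], role: str) -> Dict[str, Any]:
    """Convert a flat creator entry to an InvenioRDM-style nested creator.

    Assumes flat["name"] is "Family, Given" (hypothetical input format).
    """
    family, _, given = flat["name"].partition(",")
    creator: Dict[str, Any] = {
        "person_or_org": {
            "type": "personal",
            "family_name": family.strip(),
            "given_name": given.strip(),
        },
        "role": {"id": role},
    }
    # Affiliations are a list of objects in the nested format.
    if flat.get("affiliation"):
        creator["affiliations"] = [{"name": flat["affiliation"]}]
    return creator


# Placeholder person and institution, for illustration only.
example = to_rdm_creator(
    {"name": "Doe, Jane", "affiliation": "Example University"},
    role="datacurator",
)
print(example)
```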
feat(zenodo): generate entity-to-DOI association table after upload
Computational management of data
refactor: move D&C and file handling labs from part 3 to new part 5
Move divide-and-conquer exercises from lab-06 to lab-07 and file handling content from lab-07 to new lab-08, creating a dedicated Part 5 for these topics. Remove CSV and JSON sections from the file handling lab.
feat: add part 7 laboratories on pandas and Python classes
Add lab-09 (pandas exercises with Caravaggio dataset) and lab-10 (Python classes for Baroque painters). Include CSV dataset files for the pandas lab. Update _toc.yml to include new Part 7 and renumber Databases to Part 8. Add clean step to build scripts.
https://thinkcompute.github.io/
oc-meta
feat(finder): add merged entities reconstruction from provenance
Add tool to scan provenance files and reconstruct merge chains. The script identifies entities that were merged by detecting multiple wasDerivedFrom references in provenance snapshots, then follows the chain to find the final surviving entity.
Usage: python -m oc_meta.run.find.merged_entities -c
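The chain-following logic described above can be sketched with plain dictionaries. This is a simplified illustration, not the oc-meta implementation: it assumes the provenance has already been reduced to hypothetical `(entity, derived_from_entities)` pairs at entity level, and all function names and identifiers are made up for the example.

```python
from typing import Dict, Iterable, List, Tuple


def build_merge_map(snapshots: Iterable[Tuple[str, List[str]]]) -> Dict[str, str]:
    """Map each absorbed entity to the entity it was merged into.

    A snapshot with more than one distinct wasDerivedFrom source is
    treated as a merge; every source other than the entity itself is
    considered absorbed by it.
    """
    merged_into: Dict[str, str] = {}
    for entity, derived_from in snapshots:
        sources = set(derived_from)
        if len(sources) < 2:
            continue  # ordinary update: a single wasDerivedFrom reference
        for source in sources:
            if source != entity:
                merged_into[source] = entity
    return merged_into


def final_survivor(entity: str, merged_into: Dict[str, str]) -> str:
    """Follow the merge chain until reaching an entity that was never absorbed."""
    seen = {entity}
    while entity in merged_into:
        entity = merged_into[entity]
        if entity in seen:
            raise ValueError(f"merge cycle detected at {entity}")
        seen.add(entity)
    return entity


# Hypothetical data: br/1 was merged into br/2, which was later merged into br/3.
snapshots = [
    ("br/2", ["br/2", "br/1"]),
    ("br/3", ["br/3", "br/2"]),
    ("br/4", ["br/4"]),
]
merge_map = build_merge_map(snapshots)
print(final_survivor("br/1", merge_map))
```

The cycle check is a defensive choice for malformed provenance; well-formed merge chains should always terminate at a surviving entity.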
Questions
- Should the date of the Computational Management of Data book be updated to the new year every year? I see that the citation still says 2025 and the footer says 2023. Should we always keep these dates aligned with the current year, or at least with the date of the last update?
- comp-think