
2026-05-05 Triplelite in C

arcangelo7 · Apr 27, 2026 · opencitations/ramose

refactor(skgif): use full URIs as product identifiers instead of OMID shorthand

+29 -36 · 00c8ee2

arcangelo7 · Apr 27, 2026 · opencitations/ramose

feat: add #default_format field to override CSV default per operation

+133 -6 · fffea47

arcangelo7 · Apr 28, 2026 · opencitations/ramose

feat: add #custom_params field for addon-handled query parameters

Allow operations to declare custom query string parameters that are processed by addon functions instead of the built-in pipeline. Each parameter specifies a handler, processing phase (preprocess or postprocess), and description.

Preprocess handlers generate SPARQL fragments injected via [[name]] placeholders. Postprocess handlers transform the result table after built-in filters. When a custom parameter name collides with a built-in (filter, sort, require), the built-in behavior is disabled.
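The preprocess phase can be pictured with a minimal sketch. This is not RAMOSE's code: `inject_fragments`, `year_filter`, and the choice to replace an unused placeholder with an empty string are assumptions made for illustration only.

```python
import re

# Hypothetical names throughout: this is not RAMOSE's actual code.
PLACEHOLDER = re.compile(r"\[\[(\w+)\]\]")


def inject_fragments(sparql_template, handlers, query_params):
    """Replace each [[name]] with the fragment its handler returns.

    Assumption: a placeholder whose parameter was not supplied is
    replaced by an empty string, so the query stays valid without it.
    """
    def replace(match):
        name = match.group(1)
        if name in handlers and name in query_params:
            return handlers[name](query_params[name])
        return ""

    return PLACEHOLDER.sub(replace, sparql_template)


def year_filter(value):
    # Example preprocess handler: turns ?year=2020 into a FILTER clause.
    return f'FILTER(STR(?year) = "{int(value)}")'


template = "SELECT ?br WHERE { ?br ?p ?year . [[year]] }"
with_param = inject_fragments(template, {"year": year_filter}, {"year": "2020"})
without_param = inject_fragments(template, {"year": year_filter}, {})
```

Casting the value with `int()` inside the handler is one way to keep user input from being pasted verbatim into the query.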

+383 -21 · 4320f54

arcangelo7 · Apr 28, 2026 · opencitations/ramose

feat(skgif): expand product filter with contributor and type criteria

Support filtering by contributor attributes (family name, given name, ORCID, identifier scheme, local identifier, organization name) and by product type.

+196 -90 · f805f4b

arcangelo7 · Apr 28, 2026 · opencitations/ramose

test(skgif): validate converter output against SHACL shapes

Also, the converter now normalizes partial dates to full dates before output.
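A rough sketch of what normalizing a partial date could look like. The commit does not say how missing components are filled, so defaulting the month and day to `01` is an assumption, and `normalize_date` is a hypothetical name.

```python
import re

# Assumption: missing month/day components default to "01".
PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def normalize_date(value):
    """Pad a YYYY or YYYY-MM date to a full YYYY-MM-DD date."""
    match = PARTIAL_DATE.match(value)
    if match is None:
        raise ValueError(f"unsupported date format: {value!r}")
    year, month, day = match.groups()
    return f"{year}-{month or '01'}-{day or '01'}"
```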

+42 -3 · b7adb10

arcangelo7 · Apr 29, 2026 · opencitations/ramose

feat(skgif): add citation filters via directive injection into query templates

Placeholders can be placed anywhere in the #sparql block, and the engine resolves them before checking for @@ directives.

Four new citation filters (cf.cites, cf.cited_by, cf.cites_doi, cf.cited_by_doi) use this mechanism to federate across Meta and Index endpoints.

+420 -209 · 5e159cb

arcangelo7 · Apr 29, 2026 · opencitations/ramose

fix(api_manager): return 400 instead of 404 for invalid parameter values

When an operation exists but the parameter value doesn't match the expected type regex (e.g. a DOI where an ORCID is expected), the error now correctly reports an invalid parameter (400) rather than a missing operation (404). Empty parameters are also caught with a specific message.

Closes: #19
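The distinction between the two status codes can be sketched as follows. The operation table, the ORCID regex, and `resolve_status` are hypothetical; the sketch only mirrors the decision order described in the commit message.

```python
import re

# Hypothetical operation table: one operation with one typed parameter.
OPERATIONS = {
    "author": re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$"),  # ORCID shape
}


def resolve_status(operation, value):
    """Return the HTTP status for a call to /<operation>/<value>."""
    if operation not in OPERATIONS:
        return 404  # no such operation
    if value == "":
        return 400  # operation exists, parameter is empty
    if not OPERATIONS[operation].match(value):
        return 400  # operation exists, value has the wrong shape
    return 200
```

With this ordering, a DOI passed to the ORCID-typed operation yields 400, while only an unknown operation name yields 404.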

+57 -7 · f26a6c0

The OpenAlex ingestion ran out of memory because the Redis database holding the data and provenance counters exceeded 100 GB of RAM. That is no longer sustainable, so it is better to go back to the old file-based system. Let's see how much performance degrades. At worst, I will replace the files with a relational database.

arcangelo7 · May 1, 2026 · opencitations/oc_meta

refactor!: replace redis counters with filesystem-based counter handler
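For orientation, a minimal sketch of a file-backed counter. This is not oc_meta's counter handler: the one-file-per-counter layout and the `FileCounter` name are assumptions, and the sketch is not safe for concurrent writers.

```python
import os
import tempfile


class FileCounter:
    """One small text file per counter name (hypothetical layout)."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, name):
        return os.path.join(self.directory, f"{name}.txt")

    def read(self, name):
        try:
            with open(self._path(name), encoding="utf-8") as handle:
                return int(handle.read().strip() or 0)
        except FileNotFoundError:
            return 0  # a counter that was never written starts at zero

    def increment(self, name):
        value = self.read(name) + 1
        # Write to a temporary file, then rename, so a crash cannot
        # leave a half-written counter behind.
        tmp_path = self._path(name) + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            handle.write(str(value))
        os.replace(tmp_path, self._path(name))
        return value


with tempfile.TemporaryDirectory() as workdir:
    counters = FileCounter(workdir)
    first = counters.increment("br")
    second = counters.increment("br")
    untouched = counters.read("ra")
```

The trade-off against Redis is the one noted above: memory use stays flat, but every increment costs a file read and a file write.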

+249 -544 · a53d21b

I can do it the way I always have:

?br_uri datacite:hasIdentifier ?id .
?id literal:hasLiteralValue "9781402096327" .

Or I can use a blank node:

?br_uri datacite:hasIdentifier [ literal:hasLiteralValue "9781402096327" ] .

arcangelo7 · May 1, 2026 · opencitations/triplelite

build: add initial _core.c with primitives (StringArray, RDFTermArray)

Also switch from hatchling to meson-python build system to support C compilation

+165 -8 · 61505fb

arcangelo7 · May 2, 2026 · opencitations/triplelite

feat: add a chained hashmap (djb2)
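To make the data structure concrete, here is a sketch of a chained hash map keyed by the djb2 string hash, written in Python rather than C for brevity. The 32-bit truncation, the fixed bucket count, and the `ChainedMap` name are assumptions, not details taken from `_core.c`.

```python
def djb2(key):
    """djb2 string hash, truncated to 32 bits as a C uint32_t would be."""
    value = 5381
    for byte in key.encode("utf-8"):
        value = (value * 33 + byte) & 0xFFFFFFFF
    return value


class ChainedMap:
    """Fixed number of buckets; each bucket is a list of (key, value)."""

    def __init__(self, bucket_count=64):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        return self.buckets[djb2(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for index, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[index] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))  # collision: extend the chain

    def get(self, key, default=None):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        return default
```

With a fixed bucket count the chains grow linearly with the number of keys, so lookups degrade as the table fills.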

+163 -63 · 81137f5

arcangelo7 · May 2, 2026 · opencitations/triplelite

feat: add RDFTerm hashmap and string/RDFTerm interners

+166 -5 · 5e8256a

arcangelo7 · May 2, 2026 · opencitations/triplelite

feat: add integer set and SPO triple index

Open-addressing IntSet with linear probing
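A Python sketch of an open-addressing integer set with linear probing, to show the probing loop. Several details are assumptions rather than facts about the C code: the `-1` sentinel (so only non-negative integers are supported), the power-of-two capacity, the 0.75 load factor, and the growth step. Removal is left out, since it would need tombstones.

```python
EMPTY = -1  # sentinel; the set is assumed to hold non-negative ints only


class IntSet:
    def __init__(self, capacity=8):
        self.slots = [EMPTY] * capacity  # capacity stays a power of two
        self.size = 0

    def _find(self, slots, value):
        """Index of value, or of the empty slot where it would go."""
        mask = len(slots) - 1
        index = value & mask
        while slots[index] != EMPTY and slots[index] != value:
            index = (index + 1) & mask  # linear probing: try the next slot
        return index

    def add(self, value):
        if (self.size + 1) * 4 > len(self.slots) * 3:  # load factor > 0.75
            self._grow()
        index = self._find(self.slots, value)
        if self.slots[index] == EMPTY:
            self.slots[index] = value
            self.size += 1

    def __contains__(self, value):
        return self.slots[self._find(self.slots, value)] == value

    def __len__(self):
        return self.size

    def _grow(self):
        # Double the table and re-insert every stored value.
        bigger = [EMPTY] * (len(self.slots) * 2)
        for value in self.slots:
            if value != EMPTY:
                bigger[self._find(bigger, value)] = value
        self.slots = bigger
```

Keeping the load factor below 1 guarantees that the probing loop always reaches an empty slot and terminates.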

+378 -0 · 2a40d65

arcangelo7 · May 3, 2026 · opencitations/triplelite

feat: implement TripleLite C extension with Python bindings

Expose the C engine as a CPython type with add, remove, triples, objects, predicate_objects, subjects, and has_subject methods. Supports len, contains, and iter via a custom iterator that walks the SPO index.

Add memory ownership throughout: strdup in hashmap/dynarray/intset, deep-copy for RDFTerm, and corresponding free functions for all data structures.
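The shape of such an engine can be sketched in pure Python: terms are interned to integer ids and a nested subject → predicate → objects index answers the lookups. `TinyTripleStore` and its method signatures are hypothetical, and terms are plain strings here instead of RDFTerm values.

```python
class TinyTripleStore:
    """Interned terms plus a nested subject -> predicate -> objects index."""

    def __init__(self):
        self._ids = {}    # term string -> integer id
        self._terms = []  # integer id -> term string
        self._spo = {}    # subject id -> {predicate id -> set of object ids}
        self._count = 0

    def _intern(self, term):
        if term not in self._ids:
            self._ids[term] = len(self._terms)
            self._terms.append(term)
        return self._ids[term]

    def add(self, triple):
        s, p, o = (self._intern(term) for term in triple)
        objects = self._spo.setdefault(s, {}).setdefault(p, set())
        if o not in objects:  # duplicates do not change the length
            objects.add(o)
            self._count += 1

    def has_subject(self, subject):
        # Lookup only: an unknown term maps to None, which is never a key.
        return self._ids.get(subject) in self._spo

    def objects(self, subject, predicate):
        by_predicate = self._spo.get(self._ids.get(subject), {})
        for o in by_predicate.get(self._ids.get(predicate), ()):
            yield self._terms[o]

    def __contains__(self, triple):
        s, p, o = (self._ids.get(term) for term in triple)
        return o in self._spo.get(s, {}).get(p, ())

    def __len__(self):
        return self._count

    def __iter__(self):
        # Walk the SPO index and map ids back to terms.
        for s, by_predicate in self._spo.items():
            for p, objects in by_predicate.items():
                for o in objects:
                    yield (self._terms[s], self._terms[p], self._terms[o])
```

Interning means each distinct term is stored once and the index holds only integers. In Python, memory ownership is handled by the garbage collector; in C it has to be made explicit.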

+904 -151 · 606d2d4

| Operation | Python | C | Ratio |
| --- | --- | --- | --- |
| add_many | 1.05M triples/s | 19.8K triples/s | C 53× slower |
| add_single | 991K triples/s | 17.9K triples/s | C 55× slower |
| predicate_objects | 1.14 μs/call | 819 μs/call | C 719× slower |
| subjects | 0.75 μs/call | 156 μs/call | C 209× slower |
| objects | 0.78 μs/call | 990 μs/call | C 1,269× slower |
| has_subject | 0.14 μs/call | 71.5 μs/call | C 502× slower |
| contains | 0.50 μs/call | 143 μs/call | C 286× slower |
| full_scan | 4.56M triples/s | 2.05M triples/s | C 2.2× slower |

| Python | C | Ratio |
| --- | --- | --- |
| 373.0 bytes/triple | 295.2 bytes/triple | C 1.3× less |

arcangelo7 · May 3, 2026 · opencitations/triplelite

perf: add dynamic resizing to all hash tables and optimize query lookups

+265 -39 · d266629

arcangelo7 · May 3, 2026 · opencitations/triplelite

perf: replace chained hash tables with open-addressing

+708 -819 · b0bc371

| Operation | Python | C | Ratio |
| --- | --- | --- | --- |
| add_many | 1.07M triples/s | 2.16M triples/s | C 2.0× faster |
| add_single | 1.00M triples/s | 1.24M triples/s | C 1.2× faster |
| predicate_objects | 1.38 μs/call | 2.13 μs/call | C 0.7× |
| subjects | 0.76 μs/call | 0.80 μs/call | C 0.9× |
| objects | 0.79 μs/call | 1.01 μs/call | C 0.8× |
| has_subject | 0.14 μs/call | 0.14 μs/call | C 1.0× |
| contains | 0.42 μs/call | 0.52 μs/call | C 0.8× |
| subgraph | 1.58 μs/call | 3.80 μs/call | C 0.4× |
| full_scan | 5.68M triples/s | 2.47M triples/s | C 0.4× |

Ratios below 1.0× mean C is slower than Python for that operation.

| Python | C | Ratio |
| --- | --- | --- |
| 373.0 bytes/triple | 167.6 bytes/triple | C 2.2× less |

arcangelo7 · May 3, 2026 · opencitations/triplelite

ci: add cross-platform wheel builds

Replace single-platform uv build with cibuildwheel for building wheels across Linux (x86_64, aarch64, musl), macOS (x86_64, arm64), and Windows (AMD64, ARM64) for Python 3.10-3.13.

Add multi-OS matrix to test workflow.

+88 -11 · 0bb61b5
  • In Matilda there are many resources identified only by an arXiv id that is not reconciled with the DOI of the published version. Is that acceptable?
  • SKG-IF requires ISO 8601 datetime values for dates, which means hours, minutes, and seconds are needed as well. Is that acceptable?
  • https://github.com/opencitations/ramose/issues/2?
  • Ilaria
  • Performance? https://swsa.semanticweb.org/content/swsa-distinguished-dissertation-award
  • ISWC paper
  • Elia, when you fixed the provenance, did you check that every entity in the data had provenance snapshots? The consecutive OMIDs 0624010177378–0624010177388 and 0624010177865–0624010177868 have no provenance snapshots. They are old ones, because I have not used prefixes other than 060 for the last two dumps.
  • w3id