Tags: cre-dev/xml2db
Fixes and docs (#73)

* Fix inserted rows count to avoid returning 0 when the count is not returned by the backend
* Add zooming capability in `xml2db serve`
* Write documentation for CLI usage
* Several documentation edits

---------

Co-authored-by: Claude <noreply@anthropic.com>
Fix DuckDB bulk_insert failing when quoted CSV cells fall outside sniffer sample (#64)

* Fix DuckDB bulk_insert failing when quoted CSV cells fall outside sniffer sample (#62)

  DuckDB's read_csv sniffer only examines the first ~20 480 rows; if none are quoted, it sets quote=(empty) and then errors on any later quoted cell with a column-count mismatch. Passing quote='"' explicitly bypasses auto-detection.

  Adds a regression test that inserts 25 000 plain rows followed by one row whose value contains a comma (triggering csv.writer quoting).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Bump version to 0.13.3

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add explicit escape='"' to read_csv alongside quote='"'

  Makes the RFC 4180 doubling behaviour explicit rather than relying on DuckDB defaulting escape to the same char as quote.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
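The failure condition described in #64 can be reproduced on the CSV-writing side with the standard library alone. The sketch below is only an illustration, not the project's regression test: `build_csv` is a hypothetical helper, and it shows that `csv.writer` (with its default minimal quoting) emits no quote character at all until the first value that needs one, which here sits well past the rows a sampling sniffer would inspect.

```python
import csv
import io


def build_csv(n_plain: int) -> str:
    """Write n_plain unquoted rows, then one row whose value contains a comma.

    Hypothetical helper for illustration; not part of xml2db.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)  # default QUOTE_MINIMAL: quote only when needed
    for i in range(n_plain):
        writer.writerow([i, f"value{i}"])
    writer.writerow([n_plain, "needs, quoting"])  # comma forces quoting
    return buf.getvalue()


lines = build_csv(25_000).splitlines()

# Index of the first line that contains a quote character at all.
first_quoted = next(i for i, line in enumerate(lines) if '"' in line)

# A reader that is told the quote character up front parses the last row
# as two columns, regardless of what the earlier rows looked like.
last_row = next(csv.reader([lines[-1]], quotechar='"'))
```

A reader that infers the quote character from a leading sample of this file sees no quoted cell, which is why the commit passes the quote and escape characters to `read_csv` explicitly instead of relying on detection.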
Fix KeyError when xs:choice branches share an element with same name and type (#60)

* Fix KeyError when xs:choice branches share an element with same name and type

  In add_relation_1/add_relation_n, skip duplicate relations where a field with the same name and same XSD type already exists. Previously, self.fields (a list) accumulated both occurrences while self.relations_1/n (dicts) only kept the last, causing a KeyError in simplify_table when the first occurrence was elevated and the second tried to look it up.

  Add deliveryType to the orders crash-test schema to cover this case, with order3.xml exercising both the sequence branch (from+to) and standalone branch (to).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Bump version from 0.13.1 to 0.13.2

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
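The list/dict mismatch behind #60 can be shown with a loose, simplified model. The `Table` class below is a hypothetical stand-in and not xml2db's actual implementation: only the names `fields`, `relations_1` and `add_relation_1` come from the commit message, while `skip_duplicates`, `simplify` and `outcome` are assumptions made for the illustration.

```python
class Table:
    """Hypothetical stand-in for a table model; not xml2db's real class."""

    def __init__(self, skip_duplicates: bool):
        self.skip_duplicates = skip_duplicates
        self.fields = []        # list: keeps every occurrence
        self.relations_1 = {}   # dict: keeps one entry per name

    def add_relation_1(self, name: str, xsd_type: str) -> None:
        # The fix: ignore a relation whose name and XSD type already exist.
        if self.skip_duplicates and self.relations_1.get(name) == xsd_type:
            return
        self.fields.append(name)
        self.relations_1[name] = xsd_type

    def simplify(self) -> None:
        # Mimics a pass that consumes each relation once per listed field.
        for name in list(self.fields):
            self.relations_1.pop(name)  # raises KeyError on a duplicate


def outcome(skip_duplicates: bool) -> str:
    table = Table(skip_duplicates)
    # Two xs:choice branches both declare "to" with the same type.
    table.add_relation_1("to", "AddressType")
    table.add_relation_1("to", "AddressType")
    try:
        table.simplify()
    except KeyError:
        return "KeyError"
    return "ok"
```

Without the duplicate check the list holds two entries for one dict key, so the second lookup fails; with the check both containers stay in step.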
Document and test multiprocessing pattern for concurrent XML loading (#58)

Adds a test_multiprocessing.py test (parallel parse + serialised DuckDB writes via multiprocessing.Lock, with XML roundtrip content assertion) and a matching example in the API overview docs. Each worker creates its own DataModel with a unique temp_prefix so temp tables never collide.

Bumps version to 0.13.1.
Add DatabaseDialect and implement identifiers truncation (#54)

* Added a DatabaseDialect class hierarchy (base, postgresql, mssql, mysql, duckdb), one subclass per backend, to abstract db-specific logic and avoid conditionals scattered in the code
* Implement truncation of database identifiers, because long names were a frequent issue, especially with postgresql. Each db dialect sets its own limit
* Remove Python 3.9 support (EOL) and add Python 3.14 to the test matrix
* Bump version

---------

Co-authored-by: cre-os <opensource@cre.fr>
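One way to picture the dialect hierarchy and per-backend identifier limits from #54 is sketched below. Only the class name `DatabaseDialect` comes from the commit message; the subclass names, the `max_identifier_length` attribute, the `truncate_identifier` method and the hash-suffix scheme are all assumptions, and the project may truncate differently. The limits shown are the documented ones for each backend (PostgreSQL's is in bytes, so the character count used here is only exact for ASCII names).

```python
import hashlib


class DatabaseDialect:
    """Base dialect (sketch). Attribute and method names are assumptions."""

    max_identifier_length = None  # None means: do not truncate

    def truncate_identifier(self, name: str) -> str:
        limit = self.max_identifier_length
        if limit is None or len(name) <= limit:
            return name
        # Assumed scheme: keep a prefix and append a short digest of the
        # full name, so two long names sharing a prefix stay distinct.
        digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
        return f"{name[: limit - 9]}_{digest}"


class PostgreSQLDialect(DatabaseDialect):
    max_identifier_length = 63


class MySQLDialect(DatabaseDialect):
    max_identifier_length = 64


class MSSQLDialect(DatabaseDialect):
    max_identifier_length = 128


long_a = "order_item_delivery_address_" * 4 + "a"
long_b = "order_item_delivery_address_" * 4 + "b"
pg = PostgreSQLDialect()
```

Putting the limit on the dialect class keeps the calling code free of backend checks: it asks the active dialect for a safe name and never inspects which database it is talking to.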
Move hashing after conversion (#22)

Hashes are now computed after data types are converted, so that data that is identical in the database results in identical hashes.

---------

Co-authored-by: cre-os <opensource@cre.fr>
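The reasoning behind #22 can be illustrated with a small sketch. The functions `row_hash` and `convert` are hypothetical and do not reflect how xml2db actually computes hashes; they only show why hashing raw XML text and hashing converted values can disagree for records that end up identical once stored.

```python
import hashlib


def row_hash(values) -> str:
    """Hash a record by feeding each value's repr into SHA-1 (illustrative)."""
    h = hashlib.sha1()
    for value in values:
        h.update(repr(value).encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent values cannot merge
    return h.hexdigest()


def convert(raw):
    """Assumed target column types for the example: (integer, float)."""
    return (int(raw[0]), float(raw[1]))


# Two lexical forms of the same record, as they might appear in XML text.
raw_a = ("42", "1.0")
raw_b = ("042", "1.00")

same_before = row_hash(raw_a) == row_hash(raw_b)
same_after = row_hash(convert(raw_a)) == row_hash(convert(raw_b))
```

Here `same_before` is false and `same_after` is true: the raw strings differ, but both records convert to the same typed values, so hashing after conversion treats them as the same row.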