Warning
This project is currently in the Alpha stage. APIs and internal structures may change significantly between versions. Use with caution in production environments.
document-dataply is a pure JavaScript high-performance document-oriented database library built on top of the dataply record storage engine. It is designed to handle millions of rows with high stability, providing a structured way to store, index, and query JSON-style documents.
- Document-Oriented: Store and retrieve JSON-style documents.
- B+Tree Indexing: Supports high-performance lookups using a B+Tree indexing engine.
- Deep Indexing: Index nested object fields and specific array elements (e.g., `user.profile.name` or `tags.0`).
- Flexible Indexing Policies: Supports full re-indexing for existing data or incremental indexing for future data.
- ACID Transactions: Reliable atomic operations with WAL (Write-Ahead Logging) and MVCC (Multi-Version Concurrency Control) support.
- Modern Architecture: Fully supports Async/Await and Streaming, making it ideal for modern high-concurrency server environments.
- Rich Querying: Supports comparison operators (`lt`, `gt`, `equal`, etc.) and pattern matching (`like`).
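To make the deep-indexing paths concrete, here is a minimal sketch of how a dotted path such as `user.profile.name` or `tags.0` can be resolved against a JSON-style document. `getByPath` and the `Json` type are hypothetical names written for this illustration; they are not part of the document-dataply API, and the library's actual resolution rules may differ.

```typescript
// Hypothetical helper (not part of document-dataply): resolves a dotted
// path against a JSON-style document. Numeric segments index into arrays.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function getByPath(root: Json, path: string): Json | undefined {
  let current: Json | undefined = root;
  for (const segment of path.split('.')) {
    // Primitives, null, and undefined have no children to descend into.
    if (current === null || typeof current !== 'object') return undefined;
    if (Array.isArray(current)) {
      // Only plain non-negative integers are accepted as array indices.
      if (!/^\d+$/.test(segment)) return undefined;
      current = current[Number(segment)];
    } else {
      current = current[segment];
    }
  }
  return current;
}

const sampleDoc: Json = {
  user: { profile: { name: 'John Doe' } },
  tags: ['admin', 'developer'],
};

const fullName = getByPath(sampleDoc, 'user.profile.name'); // 'John Doe'
const firstTag = getByPath(sampleDoc, 'tags.0');            // 'admin'
const missing = getByPath(sampleDoc, 'user.address');       // undefined
```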
Built with pure JavaScript, document-dataply can be used in various environments:
- Official Support: Node.js, Electron, NW.js
- Experimental Support: Deno, Bun
Supports standard JSON data types:
- `string`, `number`, `boolean`, `null`
- Nested `object` and `array`
npm install document-dataply

import { DocumentDataply } from 'document-dataply';

type MyDocument = {
  name: string;
  age: number;
  tags: string[];
}

async function main() {
  const db = DocumentDataply.Define<MyDocument>()
    .Options({ wal: 'my-database.wal' })
    .Open('my-database.db');

  // Initialize database
  await db.init();

  // Register indices
  // use transaction to ensure atomicity
  await db.migration(1, async (tx) => {
    await db.createIndex('name', { type: 'btree', fields: ['name'] }, tx);
    await db.createIndex('tags_0', { type: 'btree', fields: ['tags.0'] }, tx);
    // Composite Index support
    await db.createIndex('idx_name_age', { type: 'btree', fields: ['name', 'age'] }, tx);
    console.log('Migration completed successfully');
  });

  // Insert document
  const id = await db.insert({
    name: 'John Doe',
    age: 30,
    tags: ['admin', 'developer']
  });

  // Query document
  const query = db.select({
    name: 'John Doe', // Shortcut for { name: { equal: 'John Doe' } }
    age: { gte: 25 }
  });

  // Get all results
  const allResults = await query.drain();

  // Or iterate through results
  for await (const doc of query.stream) {
    console.log(doc);
  }
  console.log(allResults);

  // Close database
  await db.close();
}

main();

document-dataply supports creating indices at any time, whether before or after the database is initialized.
- Pre-Init: Creating an index before `db.init()` ensures that the database is ready with all necessary structures from the start.
- Post-Init: You can call `db.createIndex()` even after the database is already running. The library will automatically create the index and perform backfilling (populating the index with existing data) in the background.
// Create a new index on an existing database
await db.createIndex('idx_new_field', { type: 'btree', fields: ['newField'] });

You can create an index on multiple fields. This is useful for optimizing queries that filter or sort by multiple criteria.
await db.createIndex('idx_composite', {
  type: 'btree',
  fields: ['category', 'price', 'status']
});

The sorting is performed element-by-element in the order defined in the `fields` array. If all values are equal, the system uses the internal `_id` as a fallback to ensure stable sorting.
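As a rough illustration of that ordering rule, the comparator below compares two documents field by field and falls back to `_id` when every field ties. It is a self-contained sketch and not the library's internal implementation; the `Product` type and the `compareValues` and `compareComposite` helpers are hypothetical.

```typescript
// Hypothetical document shape for illustration only.
type Product = { _id: number; category: string; price: number; status: string };

// Order two scalar values: numerically for numbers, lexicographically otherwise.
function compareValues(a: string | number, b: string | number): number {
  if (typeof a === 'number' && typeof b === 'number') return a - b;
  const left = String(a);
  const right = String(b);
  return left < right ? -1 : left > right ? 1 : 0;
}

// Compare field-by-field in the order given by `fields`, then fall back
// to `_id` so that documents with identical field values keep a stable order.
function compareComposite(a: Product, b: Product, fields: (keyof Product)[]): number {
  for (const field of fields) {
    const result = compareValues(a[field], b[field]);
    if (result !== 0) return result;
  }
  return a._id - b._id;
}

const products: Product[] = [
  { _id: 3, category: 'books', price: 12, status: 'active' },
  { _id: 1, category: 'books', price: 12, status: 'active' },
  { _id: 2, category: 'audio', price: 99, status: 'sold' },
];

const indexFields: (keyof Product)[] = ['category', 'price', 'status'];
const sortedIds = [...products]
  .sort((a, b) => compareComposite(a, b, indexFields))
  .map((p) => p._id);
// sortedIds is [2, 1, 3]: 'audio' sorts before 'books', and the two
// identical 'books' entries are ordered by _id.
```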
To efficiently insert multiple documents, use the following:
const ids = await db.insertBatch([
  { name: 'Alice', age: 25, tags: ['user'] },
  { name: 'Bob', age: 28, tags: ['moderator'] }
]);

document-dataply supports powerful search capabilities based on B+Tree indexing.
| Operator | Description |
|---|---|
| `lt`, `lte`, `gt`, `gte` | Comparison operations |
| `equal`, `notEqual` | Equality check |
| `like` | Pattern matching |
| `or` | Matching within an array |
| `match` | Full-text search (Requires FTS Index) |
For detailed operator usage, index constraints (including full scans), and sorting methods, see the Query Guide (QUERY.md).
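To show how the comparison and equality operators combine in a query object, here is a small in-memory model. It is only a sketch under stated assumptions: `matchesQuery`, `compareValues`, and the types are hypothetical, all conditions on a document must hold together, a bare value is treated as shorthand for `{ equal: value }`, and a missing field never matches. `like`, `or`, and `match` are left out because their exact semantics are described in QUERY.md rather than here.

```typescript
// Illustrative in-memory model only; not how document-dataply evaluates queries.
type Primitive = string | number;
type Condition = {
  lt?: Primitive;
  lte?: Primitive;
  gt?: Primitive;
  gte?: Primitive;
  equal?: Primitive;
  notEqual?: Primitive;
};
type Query = { [field: string]: Primitive | Condition };
type FlatDoc = { [field: string]: Primitive };

// Order two scalar values: numerically for numbers, lexicographically otherwise.
function compareValues(a: Primitive, b: Primitive): number {
  if (typeof a === 'number' && typeof b === 'number') return a - b;
  const left = String(a);
  const right = String(b);
  return left < right ? -1 : left > right ? 1 : 0;
}

function matchesQuery(doc: FlatDoc, query: Query): boolean {
  return Object.keys(query).every((field) => {
    const raw = query[field];
    // Assumption: a bare value is shorthand for { equal: value }.
    const cond: Condition = typeof raw === 'object' ? raw : { equal: raw };
    const value = doc[field];
    if (value === undefined) return false; // assumption: missing fields never match
    if (cond.equal !== undefined && value !== cond.equal) return false;
    if (cond.notEqual !== undefined && value === cond.notEqual) return false;
    if (cond.lt !== undefined && !(compareValues(value, cond.lt) < 0)) return false;
    if (cond.lte !== undefined && !(compareValues(value, cond.lte) <= 0)) return false;
    if (cond.gt !== undefined && !(compareValues(value, cond.gt) > 0)) return false;
    if (cond.gte !== undefined && !(compareValues(value, cond.gte) >= 0)) return false;
    return true;
  });
}

const people: FlatDoc[] = [
  { name: 'John Doe', age: 30 },
  { name: 'John Doe', age: 20 },
  { name: 'Alice', age: 41 },
];

const sampleQuery: Query = { name: 'John Doe', age: { gte: 25 } };
const hits = people.filter((person) => matchesQuery(person, sampleQuery));
// hits contains only { name: 'John Doe', age: 30 }
```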
Important
Full-Text Search (`match`): To use the `match` operator, you must configure the field as an FTS index (e.g., `{ type: 'fts', tokenizer: 'whitespace' }`). Standard `btree` indices do not support `match`. See QUERY.md for details.
Ensure data integrity with ACID-compliant transactions. Use commit() and rollback() to process multiple operations atomically.
For detailed usage and error handling patterns, see the Transaction Guide (TRANSACTION.md).
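The commit-or-rollback pattern can be sketched as a small wrapper. This is a hedged illustration and not the library's API: `TransactionLike`, `withTransaction`, and `FakeTransaction` are hypothetical names, and the asynchronous `commit()` and `rollback()` signatures are an assumption. See TRANSACTION.md for how transactions are actually obtained and passed to operations.

```typescript
// Minimal shape assumed for a transaction. The README only states that
// commit() and rollback() exist; the Promise-based signatures are an assumption.
interface TransactionLike {
  commit(): Promise<unknown>;
  rollback(): Promise<unknown>;
}

// Hypothetical helper: run `work`, roll back if it throws, commit otherwise.
async function withTransaction<T extends TransactionLike, R>(
  tx: T,
  work: (tx: T) => Promise<R>,
): Promise<R> {
  let result: R;
  try {
    result = await work(tx);
  } catch (error) {
    await tx.rollback(); // undo everything done inside `work`
    throw error;         // let the caller see the original failure
  }
  await tx.commit();
  return result;
}

// Stand-in transaction that records which method was called,
// so the helper can be exercised without a real database.
class FakeTransaction implements TransactionLike {
  events: string[] = [];
  async commit(): Promise<void> {
    this.events.push('commit');
  }
  async rollback(): Promise<void> {
    this.events.push('rollback');
  }
}
```

With a real transaction object in place of `FakeTransaction`, the callback would perform the grouped operations, and a thrown error would leave none of them applied.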
document-dataply provides flexible ways to update or delete documents based on query results. All of these operations are stream-based, so matching documents are processed incrementally instead of being loaded into memory at once, which keeps memory usage manageable even with millions of records.
- Partial Update: Modify only specific fields or use a function for dynamic updates.
- Full Update: Replace the entire document while preserving the original `_id`.
- Delete: Permanently remove matching documents from both storage and indices.
For details on streaming mechanisms and bandwidth optimization tips, see the Stream Guide (STREAM.md).
As your document structure evolves, you can use the migration() method to safely update your database. This method uses a schemeVersion to track which migrations have been applied.
await db.migration(1, async (tx) => {
  // Add a new index for an existing database
  await db.createIndex('age', { type: 'btree', fields: ['age'] }, tx);
});

For more details on handling database evolution, see the Migration Guide (MIGRATION.md).
For more information on performance optimization and advanced features, see TIPS.md.
- Query Optimization: Automatic index selection for maximum performance.
- Sorting and Pagination: Detailed usage of `limit`, `orderBy`, and `sortOrder`.
- Memory Management: When to use `stream` vs `drain()`.
- Performance: Optimizing bulk data insertion using `insertBatch`.
- Indexing Policies: Dynamic index creation and automatic backfilling.
- Composite Indexes: Indexing multiple fields for complex queries.
Registers or creates a named index. Can be called at any time.
- `options`: `{ type: 'btree', fields: string[] }` or `{ type: 'fts', fields: string, tokenizer: ... }`.
- `tx`: Optional transaction.
- Returns `Promise<this>` for chaining.
Removes a named index from the database.
- `name`: The name of the index to drop.
- `tx`: Optional transaction.
- Returns `Promise<this>` for chaining.
- Note: The internal `_id` index cannot be dropped.
Initializes the database and sets up system-managed indices. It also triggers backfilling for indices registered before init().
Runs a migration callback if the current schemeVersion is lower than the target version.
- `version`: The target scheme version (number).
- `callback`: An async function `(tx: Transaction) => Promise<void>`.
- `tx`: Optional transaction.
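The version check described above can be modeled in a few lines. This is an illustrative in-memory sketch and not the library's implementation: `VersionedStore` and `demo` are hypothetical, the starting version of 0 and the step that records the target version after the callback succeeds are assumptions, and the transaction argument that the real callback receives is omitted.

```typescript
// Illustrative model of version-gated migrations.
class VersionedStore {
  schemeVersion = 0; // assumption: a fresh database starts at version 0

  // Runs `callback` only when the stored version is lower than `target`.
  // Returns true if the callback ran, false if it was skipped.
  async migration(target: number, callback: () => Promise<void>): Promise<boolean> {
    if (this.schemeVersion >= target) return false;
    await callback();
    // Assumption: the target version is recorded once the callback succeeds.
    this.schemeVersion = target;
    return true;
  }
}

async function demo(): Promise<number[]> {
  const store = new VersionedStore();
  const ran: number[] = [];
  await store.migration(1, async () => { ran.push(1); });
  await store.migration(1, async () => { ran.push(1); }); // skipped: version is already 1
  await store.migration(2, async () => { ran.push(2); });
  return ran; // [1, 2]
}
```

Because a migration whose version has already been reached is skipped, the same startup code can run on every launch without reapplying earlier steps.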
Inserts a single document. Each document is automatically assigned a unique, immutable _id field. The method returns this _id (number).
Inserts multiple documents efficiently. Returns an array of _ids (number[]).
Searches for documents matching the query. Passing an empty object ({}) as the query retrieves all documents.
Returns an object { stream, drain }.
- `stream`: An async iterator to traverse results one by one.
- `drain()`: Returns a promise that resolves to an array of all matching documents.
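The trade-off between the two access styles can be sketched with plain async iterables. Nothing below calls document-dataply: `fakeStream` is a stand-in for a result stream, and `collectAll` and `takeFirst` are hypothetical helpers that show what collecting everything versus stopping early looks like.

```typescript
// Stand-in for a query result stream; any AsyncIterable behaves the same way.
async function* fakeStream(count: number): AsyncGenerator<{ _id: number }> {
  for (let i = 1; i <= count; i++) yield { _id: i };
}

// Collect every item into an array, which is the shape a drain-style call returns.
// Memory use grows with the number of results.
async function collectAll<T>(stream: AsyncIterable<T>): Promise<T[]> {
  const out: T[] = [];
  for await (const item of stream) out.push(item);
  return out;
}

// Stop early: the loop exits after `n` items, so the rest of the stream
// is never produced or held in memory.
async function takeFirst<T>(stream: AsyncIterable<T>, n: number): Promise<T[]> {
  const out: T[] = [];
  if (n <= 0) return out;
  for await (const item of stream) {
    out.push(item);
    if (out.length >= n) break;
  }
  return out;
}
```

Iterating suits large result sets or early exits, while collecting into an array is convenient when the full result is known to be small.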
Partially updates documents matching the query. newFields can be a partial object or a function that returns a partial object. Returns the number of updated documents.
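The two forms of `newFields` can be modeled as follows. This is a sketch under assumptions, not the library's code: `applyPartialUpdate` and the `User` types are hypothetical, the merge is assumed to be shallow, and the function form is assumed to receive the current document.

```typescript
// Hypothetical stored document shape for illustration only.
type User = { _id: number; name: string; age: number; tags: string[] };
type UserFields = Partial<Omit<User, '_id'>>;
type UserPatch = UserFields | ((doc: User) => UserFields);

// Apply either a partial object or a function that computes one.
function applyPartialUpdate(doc: User, patch: UserPatch): User {
  const fields = typeof patch === 'function' ? patch(doc) : patch;
  // Shallow merge (an assumption), then re-assert the original _id so that
  // it stays unchanged regardless of what the patch contains.
  return { ...doc, ...fields, _id: doc._id };
}

const stored: User = { _id: 7, name: 'John Doe', age: 30, tags: ['admin'] };

// Object form: only the listed fields change.
const renamed = applyPartialUpdate(stored, { name: 'Jane Doe' });

// Function form: the new value depends on the current document.
const older = applyPartialUpdate(stored, (doc) => ({ age: doc.age + 1 }));
```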
Fully replaces documents matching the query while preserving their _id. Returns the number of updated documents.
Deletes documents matching the query. Returns the number of deleted documents.
Returns physical storage information and index metadata.
- Returns `Promise<{ pageSize, pageCount, rowCount, indices, schemeVersion }>`
- `indices`: List of user-defined index names.
- `schemeVersion`: The current schema version of the database.
Returns a new Transaction object.
Flushes changes and closes the database files.
Automated benchmarks are executed on every push to the main branch and for every pull request. This ensures that performance regressions are detected early.
- Dataset: 10,000 documents
- Operations: Batch Insert, Indexed Select, Partial Update, Full Update, Delete
You can view the real-time performance trend and detailed metrics on our Performance Dashboard.
Tip
Continuous Monitoring: We use github-action-benchmark to monitor performance changes. For every PR, a summary of the performance impact is automatically commented to help maintain high efficiency.
MIT