Skip to content

simplify dist by merging sql-asm-memory-growth.js into sql-asm.js#431

Open
kaizhu256 wants to merge 3 commits intosql-js:masterfrom
kaizhu256:merge.asm-memory-growth.into.asm
Open

simplify dist by merging sql-asm-memory-growth.js into sql-asm.js#431
kaizhu256 wants to merge 3 commits intosql-js:masterfrom
kaizhu256:merge.asm-memory-growth.into.asm

Conversation

@kaizhu256
Copy link
Member

Co-authored-by: Yan Li yanli0303@outlook.com
Co-authored-by: kai zhu kaizhu256@gmail.com

this implements suggestion #239 (comment)

removed or edited all references to memory-growth using command git grep -i 'memory.*growth'

Co-authored-by: Yan Li <yanli0303@outlook.com>
Co-authored-by: kai zhu <kaizhu256@gmail.com>
@lovasoa
Copy link
Member

lovasoa commented Oct 21, 2020

What is the performance impact ?

@kaizhu256
Copy link
Member Author

kaizhu256 commented Oct 22, 2020

good question. the following benchmark indicates asm2 is 1-6% slower running queries on a 60mb database:

// asm2 = LEGACY_VM_SUPPORT + ALLOW_MEMORY_GROWTH
// used build-artifacts from https://github.com/sql-js/sql.js/actions/runs/320372611 for asm2 binaries
// used build-artifacts from https://github.com/sql-js/sql.js/actions/runs/316181408 for asm, wasm binaries

loading 60mb database in chrome
sampled 10 times
asm2:   20.6 +/- 2.2 ms   ( 5% slower than asm)
asm :   19.7 +/- 1.9 ms
wasm:   15.4 +/- 0.3 ms   (22% faster than asm)

copy all tables in 60mb database in chrome
sampled 5 times
asm2:   15798 +/-  47 ms  (  1% slower than asm)
asm :   15610 +/- 152 ms
wasm:    3384 +/-  47 ms  (360% faster than asm)

fts-query with joins in chrome
sampled 5 times
asm2:   40464 +/- 115 ms  (  6% slower than asm)
asm :   38171 +/- 220 ms
wasm:   13942 +/- 216 ms  (170% faster than asm)

you can emulate above result in following steps:

image

  1. download 60mb enron-email database from https://github.com/kaizhu256/sqlite-enron

  2. load database into modified sql.js demos below:

  3. run following sql-queries used in benchmarks:

copy all tables in 60mb database

-- create database-schema
CREATE TABLE `employeelist2` (
  `eid` integer  NOT NULL PRIMARY KEY AUTOINCREMENT
,  `firstName` varchar(31) NOT NULL DEFAULT ''
,  `lastName` varchar(31) NOT NULL DEFAULT ''
,  `Email_id` varchar(31) NOT NULL DEFAULT ''
,  `Email2` varchar(31) DEFAULT NULL
,  `Email3` varchar(31) DEFAULT NULL
,  `EMail4` varchar(31) DEFAULT NULL
,  `folder` varchar(31) NOT NULL DEFAULT ''
,  `status` varchar(50) DEFAULT NULL
,  UNIQUE (`Email_id`)
);
CREATE TABLE `message2` (
  `mid` integer NOT NULL DEFAULT '0'
,  `sender` varchar(127) NOT NULL DEFAULT ''
,  `date` datetime DEFAULT NULL
,  `message_id` varchar(127) DEFAULT NULL
-- ,  `subject` text
-- ,  `body` text
,  `folder` varchar(127) NOT NULL DEFAULT ''
,  PRIMARY KEY (`mid`)
);
CREATE TABLE `recipientinfo2` (
  `rid` integer NOT NULL DEFAULT '0'
,  `mid` integer  NOT NULL DEFAULT '0'
,  `rtype` text  DEFAULT NULL
,  `rvalue` varchar(127) DEFAULT NULL
,  `dater` datetime DEFAULT NULL
,  PRIMARY KEY (`rid`)
,  FOREIGN KEY (mid) REFERENCES message2(mid)
);
CREATE TABLE `referenceinfo2` (
  `rfid` integer NOT NULL DEFAULT '0'
,  `mid` integer NOT NULL DEFAULT '0'
,  `reference` text
,  PRIMARY KEY (`rfid`)
,  FOREIGN KEY (mid) REFERENCES message2(mid)
);
CREATE INDEX "idx_message_sender2" ON "message2" (`sender`);
CREATE VIRTUAL TABLE messages_ft2 USING fts4(subject, body);

-- insert tables
-- employeelist2
-- message2
-- recipientinfo2
-- referenceinfo2
-- message_ft
SELECT 'employeelist', COUNT(*) FROM employeelist;
SELECT 'message', COUNT(*) FROM message;
SELECT 'recipientinfo', COUNT(*) FROM recipientinfo;
SELECT 'referenceinfo', COUNT(*) FROM referenceinfo;
SELECT 'messages_ft', COUNT(*) FROM messages_ft;

INSERT INTO message2 SELECT * FROM
    (
    SELECT
        message.mid,
        message.sender,
        message.date,
        message.message_id,
        -- message.subject,
        -- message.body,
        message.folder
    FROM
    (
    SELECT DISTINCT message.mid
    FROM message
    INNER JOIN recipientinfo ON recipientinfo.mid = message.mid
    INNER JOIN referenceinfo ON referenceinfo.mid = message.mid
    ) AS tmp1
    INNER JOIN message ON message.mid = tmp1.mid
    ORDER BY tmp1.mid
    LIMIT 10000
    ) AS tmp1;

INSERT INTO employeelist2 SELECT * FROM employeelist;
INSERT INTO recipientinfo2 SELECT
        tmp1.rid,
        tmp1.mid,
        tmp1.rtype,
        tmp1.rvalue,
        tmp1.dater
    FROM message2
    LEFT JOIN recipientinfo AS tmp1 ON tmp1.mid = message2.mid
    ORDER BY tmp1.rid;
INSERT INTO referenceinfo2 SELECT
        tmp1.rfid,
        tmp1.mid,
        tmp1.reference
    FROM message2
    LEFT JOIN referenceinfo AS tmp1 ON tmp1.mid = message2.mid
    ORDER BY tmp1.rfid;
INSERT INTO messages_ft2 (
    rowid, subject, body
)
SELECT
        tmp1.rowid, tmp1.subject, tmp1.body
    FROM message2
    INNER JOIN messages_ft AS tmp1 ON tmp1.rowid = message2.mid
    ORDER BY tmp1.rowid;
VACUUM;
SELECT 'employeelist2', COUNT(*) FROM employeelist2;
SELECT 'message2', COUNT(*) FROM message2;
SELECT 'recipientinfo2', COUNT(*) FROM recipientinfo2;
SELECT 'referenceinfo2', COUNT(*) FROM referenceinfo2;
SELECT 'messages_ft2', COUNT(*) FROM messages_ft2;
--*/

fts-query with joins

SELECT * FROM messages_ft2
INNER JOIN recipientinfo2 ON recipientinfo2.mid = messages_ft2.rowid
INNER JOIN referenceinfo2 ON referenceinfo2.mid = messages_ft2.rowid
WHERE body MATCH 'energy NEAR oil';
@kaizhu256
Copy link
Member Author

also the difference in size is < 1%:

                                 asm        asm2
sql-asm-debug.js          10,865,349  10,896,696  bytes
worker.sql-asm-debug.js   10,867,832  10,899,179  bytes
sql-asm.js                 2,383,575   2,389,992  bytes
worker.sql-asm.js          2,386,058   2,392,475  bytes
@lovasoa
Copy link
Member

lovasoa commented Oct 26, 2020

If we want to be able to merge this without breaking backwards compatibility, we can first add -s LEGACY_VM_SUPPORT=1 and -s ALLOW_MEMORY_GROWTH=1 into the asmjs version, and add a deprecation warning to the memory growth version.

On the next major version bump, we can delete the memory growth version.

@kaizhu256
Copy link
Member Author

k. would updated patch with following be ok?

image

image

@kaizhu256
Copy link
Member Author

latest-patch removes redundant npm-tests on sql-asm-memory-growth.js, since its now essentially a carbon-copy of sql-asm.js (plus a deprecation-warning header).

this simplifies sunsetting sql-asm-memory-growth.js in some future-release by simply deleting the following lines in /Makefile:

--- a/Makefile
+++ b/Makefile
@@ -101,9 +101,6 @@ dist/sql-wasm.js: $(BITCODE_FILES) $(OUTPUT_WRAPPER_FILES) $(SOURCE_API_FILES) $
        mv $@ out/tmp-raw.js
        cat src/shell-pre.js out/tmp-raw.js src/shell-post.js > $@
        rm out/tmp-raw.js
-       # TODO - remove code below in future releases to sunset sql-asm-memory-growth.js
-       printf 'console.error(\n    "\\n\\nDEPRECATION WARNING.\\n"\n    + "sql-asm-memory-growth.js will be removed in future releases.\\n"\n    + "Use sql-asm.js instead, which now includes memory-growth support.\\n\\n"\n);\n' > dist/sql-asm-memory-growth.js
-       cat $@ >> dist/sql-asm-memory-growth.js

 # Web worker API
 .PHONY: worker

if you still want npm-tests on sql-asm-memory-growth.js i can revert this patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants