HTTP server to store and search similar articles.
Run the server and storage containers with Compose:
    docker-compose up
The API is accessible via http://localhost:80/.
The API is described in the docs/API file.
Additionally, the server serves HTML documentation. Run docker-compose up and visit http://localhost:80/docs.
To measure similarity between the content of articles, the Levenshtein algorithm is applied to words rather than characters. Before the Levenshtein algorithm is applied, the content is preprocessed:
- remove articles (a, an, the) and punctuation .,!?-;
- split content into words on whitespace characters \t\n\r;
- replace all irregular verbs with their infinitive; the irregular verbs are listed in the file assets/irregular_verbs.csv;
- lower-case the text.
The algorithm works for English content only.
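A word-level Levenshtein distance can be sketched as below: it counts the minimum number of word insertions, deletions, and substitutions needed to turn one word list into another. The function names wordLevenshtein and min3 are illustrative, not taken from the project, and how the server turns a distance into a "similar or not" decision is not described in this document, so that step is left out.

```go
package main

import "fmt"

// min3 returns the smallest of three integers.
func min3(x, y, z int) int {
	m := x
	if y < m {
		m = y
	}
	if z < m {
		m = z
	}
	return m
}

// wordLevenshtein returns the edit distance between two word lists,
// treating each word as a single symbol.
func wordLevenshtein(a, b []string) int {
	// prev[j] holds the distance between the first i-1 words of a
	// and the first j words of b.
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}

	for i := 1; i <= len(a); i++ {
		cur := make([]int, len(b)+1)
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(
				prev[j]+1,      // delete a word from a
				cur[j-1]+1,     // insert a word from b
				prev[j-1]+cost, // substitute (or keep) a word
			)
		}
		prev = cur
	}
	return prev[len(b)]
}

func main() {
	a := []string{"cat", "write", "book"}
	b := []string{"cat", "read", "new", "book"}
	fmt.Println(wordLevenshtein(a, b)) // prints 2
}
```

Keeping only two rows of the dynamic-programming table keeps memory proportional to the length of the shorter input if the arguments are ordered accordingly, which may matter for long articles.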
See the SCALEME file.
The project consists of an HTTP server written in Go and MongoDB storage.
Prerequisites: docker, docker-compose, go@1.15, and make must be installed.
Consistent code style is enforced by gofmt, EditorConfig tools, and the golangci-lint linter.
Format code:
    make format
Run linter:
    make lint
There are unit and integration tests. Unit tests are placed in _test.go files, end-to-end tests in the test directory.
Run unit tests:
    make test
The end-to-end test suite builds the server from sources, runs docker-compose up, and performs requests to the server container.
It can be executed with:
    make test-it
Build the docker image article-similarity:latest:
    make docker
Build, run the linter and tests in the dev docker image article-similarity-dev:latest:
    make docker-dev
GitHub Actions are configured to build, lint, and run unit and integration tests. See the .github/workflows directory.