This repository was archived by the owner on Jan 24, 2022. It is now read-only.

alexandear/article-similarity

Article Similarity

HTTP server to store and search similar articles.

Getting started

Run the server and storage containers with Compose:

docker-compose up

The API is accessible at http://localhost:80/.

API docs

The API description is in the docs/API file.

Additionally, the server serves HTML documentation. Run docker-compose up and visit http://localhost:80/docs.

Similarity algorithm

Similarity between the contents of articles is measured with the Levenshtein algorithm applied at the word level. Before the algorithm runs, the content is preprocessed:

  • articles (a, an, the) and punctuation (.,!?-;) are removed;
  • content is split into words on whitespace characters (\t\n\r);
  • irregular verbs are replaced with their infinitive; the list of irregular verbs is in the assets/irregular_verbs.csv file;
  • text is lower-cased.

The algorithm works for English content only.

Scalability

See the SCALEME file.

Technologies

The project consists of an HTTP server written in Go and MongoDB storage.

Development

Prerequisites: docker, docker-compose, go@1.15, and make must be installed.

Code style

Consistent code style is enforced by gofmt, EditorConfig tools, and the golangci-lint linter.

Format code:

make format

Run linter:

make lint

Tests

There are unit and integration (end-to-end) tests. Unit tests are placed in _test.go files; end-to-end tests are in the test directory.

Run unit tests:

make test

The end-to-end test suite builds the server from sources, runs docker-compose up, and performs requests against the server container. Run it with:

make test-it

Docker

Build the Docker image article-similarity:latest:

make docker

Build the project and run the linter and tests inside the dev Docker image article-similarity-dev:latest:

make docker-dev

CI

GitHub Actions workflows are configured to build, lint, and run unit and integration tests. See the .github/workflows directory.

About

DevChallenge XVII. Backend Online Round
