bookshelf

6 Commits 1 Branch 0 Tags

Author	SHA1	Message	Date
Petr Polezhaev	fd32be729f	Replace config-driven HtmlScraperPlugin with specific archive classes. Each archive scraper now has its own class with hardcoded URL and parsing logic; config only carries auto_queue, timeout, and rate_limit_seconds. - html_scraper: refactor to base class with public shared utilities (YEAR_RE, AUTHOR_PREFIX_PAT, cls_inner_texts, img_alts) - rusneb.py (new): RusnebPlugin extracts year per list item rather than globally, eliminating wrong page-level dates - alib.py (new): AlibPlugin extracts year from within each <p><b> entry rather than globally, fixing nonsensical year values - shpl.py (new): ShplPlugin retains the dead ШПИЛ endpoint with hardcoded params; config type updated from html_scraper to shpl - config: remove config: subsections from rusneb, alib_web, shpl entries; update type fields to rusneb, alib_web, shpl respectively - plugins/__init__.py: register new specific types, remove html_scraper - tests: use specific plugin classes; assert all CandidateRecord fields (source, title, author, year, isbn, publisher) with appropriate constraints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-10 00:03:17 +03:00
Petr Polezhaev	b8f82607f9	Fix archive plugins for НЭБ and Alib; add network integration tests. - html_scraper: add img_alt strategy (НЭБ titles from <img alt>), bold_text strategy (Alib entries from <p><b>), Windows-1251 encoding support, _cls_inner_texts() helper that strips inner HTML tags - rsl: rewrite to POST SearchFilterForm[search] with CSRF token and CQL title:(words) AND author:(word) query format - config: update rusneb (img_alt + correct author_class) and alib_web (encoding + bold_text) to match fixed plugin strategies - tests: add tests/test_archives.py with network-marked tests for all six archive plugins; НЛР and ШПИЛ marked xfail (endpoints return HTTP 404) - presubmit: exclude network tests from default run (-m "not network") Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-09 22:59:19 +03:00
Petr Polezhaev	ce03046e51	Fix boundary/child count invariant on shelf and book deletion. When deleting a shelf or book, remove the corresponding boundary from the parent's boundary list so len(boundaries) == len(children) - 1 is maintained. Add API-level tests covering first, middle, and last child deletion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-09 19:45:55 +03:00
Petr Polezhaev	7095cbaa60	Add git/docs rules to AGENTS.md; add docs reminder to presubmit	2026-03-09 14:30:16 +03:00
Petr Polezhaev	2ab41ead9f	Update docs; add contributing standards. - docs/overview.md: rewrite for current architecture (src/ layout, split JS/CSS modules, credentials/models/functions/ui config categories, correct test fixture targets) - docs/contributing.md (new): documentation philosophy and style guide - AGENTS.md: add rule to follow docs/contributing.md	2026-03-09 14:22:30 +03:00
Petr Polezhaev	084d1aebd5	Initial commit. Photo-based book cataloger with AI identification. Room → Cabinet → Shelf → Book hierarchy; FastAPI + SQLite backend; vanilla JS SPA; OpenAI-compatible plugin system for boundary detection, text recognition, and archive search.	2026-03-09 14:17:13 +03:00
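
The invariant named in commit ce03046e51, len(boundaries) == len(children) - 1, can be illustrated with a minimal sketch. This is not the repository's code: the `Parent` dataclass, the `delete_child` and `invariant_holds` helpers, and the rule for which boundary to drop (the one to the left of the removed child, or the first one when child 0 is removed) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Parent:
    """Hypothetical container: a cabinet holding shelves or a shelf holding books."""
    children: list[str] = field(default_factory=list)
    # One boundary sits between each pair of adjacent children.
    boundaries: list[float] = field(default_factory=list)


def invariant_holds(parent: Parent) -> bool:
    """Check len(boundaries) == len(children) - 1 (zero boundaries when empty)."""
    return len(parent.boundaries) == max(len(parent.children) - 1, 0)


def delete_child(parent: Parent, index: int) -> None:
    """Remove one child and one adjacent boundary so the invariant survives."""
    if not 0 <= index < len(parent.children):
        raise IndexError(f"no child at index {index}")
    del parent.children[index]
    if parent.boundaries:
        # Assumed rule: drop the boundary on the left of the removed child;
        # the first child has no left boundary, so drop boundary 0 instead.
        del parent.boundaries[max(index - 1, 0)]


if __name__ == "__main__":
    # Cover first, middle, and last child deletion, as the commit's tests do.
    for index in (0, 1, 2):
        shelf = Parent(children=["a", "b", "c"], boundaries=[0.3, 0.7])
        delete_child(shelf, index)
        print(index, shelf.children, shelf.boundaries, invariant_holds(shelf))
```

Deleting the only remaining child leaves both lists empty, which the check treats as valid. The actual project may choose a different boundary to remove for middle children; only the count relationship is stated in the commit message.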