diff --git a/HACKING.md b/HACKING.md index f65807e8..1d6d408c 100644 --- a/HACKING.md +++ b/HACKING.md @@ -1,5 +1,4 @@ -Getting started with Debsources development -=========================================== +# Getting started with Debsources development You have 2 documented ways to get a local Debsources environment: either a local deployment directly in your OS, or within a Docker @@ -10,24 +9,23 @@ To test the updater, and subsequently run the webapp on it, you will need a might want to use the data from the Debsources testsuite, which is shipped via a separate Git submodule rooted at testdata/, so: - $ cd debsources/ - $ git submodule update --init +$ cd debsources/ +$ git submodule update --init The testdata Git repository is ~150 MB, so it might take a while to retrieve. -Local Debsources deployment ---------------------------- +## Local Debsources deployment - clone the Debsources Git repository: $ git clone https://salsa.debian.org/qa/debsources.git -or + or $ git clone git@salsa.debian.org:qa/debsources.git - ensure the Python interpreter can find Debsources' Python modules: $ export PYTHONPATH=`pwd`/debsources/lib:"$PYTHONPATH" - $ python -c 'import debsources' # if this fails, double-check $PYTHONPATH + $ python -c 'import debsources' # if this fails, double-check $PYTHONPATH - create a PostgreSQL database for use by Debsources, e.g.: @@ -67,14 +65,14 @@ or - run the webapp: $ bin/debsources-run-app - * Running on http://127.0.0.1:5000/ - * Restarting with reloader + + - Running on http://127.0.0.1:5000/ + - Restarting with reloader you can now visit the above URL with your browser and verify that everything is OK. -Docker container ---------------- +## Docker container - Ensure docker is installed and the service is running, then build the Debsources image (may take a while): @@ -96,18 +94,14 @@ Docker container $ make attach -You're ready for Debsources hacking! How about giving Debsources easy hacks a -go now? - +You're ready for Debsources hacking! 
How about giving Debsources easy hacks a +go now? -Running tests -------------- +## Running tests See [testing.md](doc/testing.md]. - -Coding conventions -================== +# Coding conventions All new Debsources code should be [PEP8][1] compliant and pass [pyflakes][2] validation. Before submitting patches, please make sure that the lines of code diff --git a/IDEAS.md b/IDEAS.md index 2cf8b3f0..faf6d477 100644 --- a/IDEAS.md +++ b/IDEAS.md @@ -1,10 +1,8 @@ -Ideas for internships, GSoC, Outreach, and friends -================================================== +# Ideas for internships, GSoC, Outreach, and friends (for inspiration, the open bugs are listed at http://deb.li/debsrcbugs) -Debsources on Mobile --------------------- +## Debsources on Mobile Enabling Debsources to work on mobile browsers, via an hybrid (desktop/mobile) design, is an interesting and useful challenge. A @@ -20,11 +18,11 @@ with the browser extension), etc). There is a design challenge involved here, and also a technology choice (e.g. Cordova/PhoneGap vs. real native). -Support of other operating systems ----------------------------------- +## Support of other operating systems Support of security.debian.org, and other operating systems, poses few challenges: + - refactoring (adding a table for the different archives, changing primary keys, lots of UI changes, etc). - support of the updates coming from different archives through @@ -36,20 +34,19 @@ challenges: - through hard links via a cronjob (involving race conditions and similar challenges). -Support of other hashing algorithms ------------------------------------ +## Support of other hashing algorithms In the files table, we currently only compute the sha256 sum. It would be interesting to have other checksums. 
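A minimal sketch of what computing several checksums in one pass over a file could look like, using only Python's standard `hashlib`. The function name `multi_digest` and the choice of algorithms are assumptions made for illustration; they are not taken from the Debsources code base or its database schema.

```python
import hashlib
import io


def multi_digest(stream, algorithms=("sha256", "sha512", "blake2b"), chunk_size=65536):
    """Return {algorithm: hex digest} for a binary stream, reading it only once.

    Hypothetical helper: the algorithm list is an illustrative assumption.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    # Read fixed-size chunks so large source files are never loaded in full.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


digests = multi_digest(io.BytesIO(b"int main(void) { return 0; }\n"))
```

Feeding every hasher from the same chunk keeps the I/O cost equal to that of the current single sha256 computation; only the CPU cost grows with the number of algorithms.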
-Integrated sources editor -------------------------- +## Integrated sources editor Raphael Geissert has developed a Firefox/Chrome plugin to allow the edition of a file directly in Debsources, and to generate a patch ready to be sent to the maintainer of the modified package. See http://rgeissert.blogspot.fr/2015/08/updates-to-sourcesdebiannet-editor.html It would be awesome to: + - integrate it in Debsources' code base, so that users don't require to install the browser extension, - and improve it to support e.g. multi-file editing (that needs diff --git a/README.md b/README.md index bc66a03d..983329bf 100644 --- a/README.md +++ b/README.md @@ -1,14 +1,14 @@ -Development -=========== +# Development -* **source code** is available using [Git][1]: +- **source code** is available using [Git][1]: $ git clone https://salsa.debian.org/qa/debsources.git - or $ git clone git@salsa.debian.org:qa/debsources.git + + or $ git clone git@salsa.debian.org:qa/debsources.git and browsable [on the Web][2]. -* please report **bugs** to the [Debian Bug Tracking System][6] (short URL: +- please report **bugs** to the [Debian Bug Tracking System][6] (short URL: ), against the `qa.debian.org` pseudo-package, using a subject line that begins with "debsources:". 
@@ -19,10 +19,10 @@ Development (`bin/debsources-reportbug` in the Debsources' Git repo is a convenience script that does the above for you) -* for discussions about Debsources please **contact** the +- for discussions about Debsources please **contact** the [debian-qa-debsources mailing list][4] or the `#debian-debsources` IRC channel on [OFTC][5] -* opportunities for new contributors (AKA **easy hacks**) are [available][7] as +- opportunities for new contributors (AKA **easy hacks**) are [available][7] as well (short URL: ) [1]: http://git-scm.com/ @@ -32,15 +32,11 @@ Development [6]: https://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=qa.debian.org;tag=debsources [7]: https://bugs.debian.org/cgi-bin/pkgreport.cgi?package=qa.debian.org;include=subject:debsources;tag=newcomer - To get started with Debsources development, have a look at the [HACKING](HACKING.md) file. +# Dependencies -Dependencies -============ - -Webapp ------- +## Webapp Debian packages: @@ -56,9 +52,7 @@ Debian packages: - python-magic - tango-icon-theme - -Infrastructure --------------- +## Infrastructure (work in progress, likely incomplete) @@ -74,8 +68,7 @@ Debian packages: - python-sqlalchemy - sloccount -Other ------ +## Other To re-generate the documentation: diff --git a/contrib/bootstrap/README.md b/contrib/bootstrap/README.md index eda03f0e..7c1ddfd3 100644 --- a/contrib/bootstrap/README.md +++ b/contrib/bootstrap/README.md @@ -1,13 +1,11 @@ -Generating Bootstrap -==================== +# Generating Bootstrap We use bootstrap's customizer to ensure our bootstrap is the small and also stays with the aesthetics of debian. The file `config.json` contains related configuration. -How To Generate ------------------ +## How To Generate - Go to https://getbootstrap.com/customize/ @@ -15,6 +13,7 @@ How To Generate `/contrib/bootstrap` to match. 
- Download the `bootstrap.zip` file + ```sh BASE=/path/to/debsouces/repo # make sure this points to top level unzip ~/Downloads/bootstrap.zip -d /tmp # Download location may vary diff --git a/doc/archiving-a-suite.md b/doc/archiving-a-suite.md index 66f0d895..b0668e3d 100644 --- a/doc/archiving-a-suite.md +++ b/doc/archiving-a-suite.md @@ -1,28 +1,28 @@ # Archiving a suite -How to mark a suite as sticky (ensuring its packages remain around) *before* it +How to mark a suite as sticky (ensuring its packages remain around) _before_ it gets removed from the mirror network. -1) edit `lib/debsources/consts.py`, setting `archived: True` on the relevant suite +1. edit `lib/debsources/consts.py`, setting `archived: True` on the relevant suite (if it exists there otherwise, e.g., `*-lts` variants, don't bother) -2) on the DB: set to `t` the column `sticky` of the relevant suite, e.g.: +2. on the DB: set to `t` the column `sticky` of the relevant suite, e.g.: - ```sql - update suites_info set sticky = 't' where name = 'squeeze'; - ``` + ```sql + update suites_info set sticky = 't' where name = 'squeeze'; + ``` -3) archive the suite using the archiver `add` action, e.g.: +3. archive the suite using the archiver `add` action, e.g.: - ```shell - bin/debsources-suite-archive add squeeze - ``` + ```shell + bin/debsources-suite-archive add squeeze + ``` - or, more precisely on sources.d.o machine: + or, more precisely on sources.d.o machine: - ```shell - sudo -u debsources PYTHONPATH=./lib bin/debsources-suite-archive add squeeze -vvv --single-transaction no - ``` + ```shell + sudo -u debsources PYTHONPATH=./lib bin/debsources-suite-archive add squeeze -vvv --single-transaction no + ``` -4) run `bin/debsources-suite-archive list` and check that the given suite is +4. 
run `bin/debsources-suite-archive list` and check that the given suite is marked as both available and indexed, i.e., `True` on both columns diff --git a/doc/bugs.debian.org-usertag.md b/doc/bugs.debian.org-usertag.md index d6ad8a53..d1491719 100644 --- a/doc/bugs.debian.org-usertag.md +++ b/doc/bugs.debian.org-usertag.md @@ -1,8 +1,8 @@ A list of Debian bugs related, for various reasons, to Debsources and/or sources.debian.org is available at: - https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=debsources;users=zack@debian.org +https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=debsources;users=zack@debian.org To add a bug to that list (please ask qa-debsources@lists.alioth.debian.org first): - bts user zack@debian.org , usertag XXXXXX debsources +bts user zack@debian.org , usertag XXXXXX debsources diff --git a/doc/celery.md b/doc/celery.md index 25b9b3d7..786b243b 100644 --- a/doc/celery.md +++ b/doc/celery.md @@ -1,8 +1,6 @@ -Research -======== +# Research -Persistence ------------ +## Persistence With the AQMP backend, tasks are persistent by default (messages are saved both in-memory and on disk). @@ -14,21 +12,19 @@ run again. It is possible to enable 'late acks', meaning tasks will be aknowledged only after a successful execution. Tasks need to be idempotent for that to work correctly. - - http://celery.readthedocs.org/en/latest/faq.html#faq-acks-late-vs-retry +- http://celery.readthedocs.org/en/latest/faq.html#faq-acks-late-vs-retry - - http://celery.readthedocs.org/en/latest/configuration.html#celery-acks-late +- http://celery.readthedocs.org/en/latest/configuration.html#celery-acks-late - - -Transactional tasks -------------------- +## Transactional tasks Celery does not support transactional task queues. 
Several libraries try to add the feature: pyramid_transactional_celery -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``` -> https://pypi.python.org/pypi/pyramid_transactional_celery @@ -45,10 +41,10 @@ django-transaction-barrier -> https://libraries.io/pypi/django-transaction-barrier -For django. +For django. -Unit tests +Unit tests ---------- For unit testing, tasks can easily be made synchronous. @@ -78,10 +74,10 @@ This mean tasks running the hooks can send the results of the plugins to a callback which will insert those results into the database. This solution allows: - + - running the hooks on machines that don't have access to the database - + - not importing a package if one of the hooks failed However, it increases the network overhead. In particular, the ctags @@ -128,14 +124,14 @@ When a node does not have edges, it means the resource is needed by all tasks in the cluster. Running tasks near the resources -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +``` The simplest way to run tasks on machine with access to the needed resources is to create several queues, one for each resource: - - mirror +- mirror - - database +- database For example, several workers can listen to the 'mirror' queue, and all tasks routed to that queue will run only on those workers. @@ -146,12 +142,9 @@ thus there is no single repository of sources. In this case, we can dynamically find workers running on the same machine as the "add_package" task, and direct the hooks tasks on those workers. +# Implementation details -Implementation details -====================== - -sqlalchemy session ------------------- +## sqlalchemy session The session is setup in the DBTask class, which will be used as the base class for all celery tasks that need to access the database. 
That @@ -180,11 +173,9 @@ In unit tests, we must close the session of all task classes, for example: add_package.session.close() add_package.engine.dispose() - [1] http://celery.readthedocs.org/en/latest/userguide/tasks.html#instantiation -plugin tasks ------------- +## plugin tasks Celery has a way to define tasks that don't depend on a celery application: use the `celery.shared_task decorator` instead of @@ -204,48 +195,36 @@ started. debsources_conf['observers'], debsources_conf['file_exts'] = \ mainlib.load_hooks(debsources_conf) - - - [1] http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html#using-the-shared-task-decorator [2] http://celery.readthedocs.org/en/latest/userguide/signals.html#celeryd-init +# Celery configuration -Celery configuration -==================== - -Result backend --------------- +## Result backend To use chords, we need to keep a result backend for keeping the results of tasks and passing them to other tasks. Result backends are disabled by default, and several choices are available[1]: - - - sqlalchemy - - memcached - - redis - - rabbitmq - - ... + +- sqlalchemy +- memcached +- redis +- rabbitmq +- ... 
[1] http://celery.readthedocs.org/en/latest/configuration.html#task-result-backend-settings -Dependencies -============ - - - python-celery - - rabbitmq +# Dependencies +- python-celery +- rabbitmq -Running -======= +# Running -Worker ------- +## Worker bin/debsources-async-celery worker - -Updater -------- +## Updater bin/debsources-async-update diff --git a/doc/copyright.debian.net-api.md b/doc/copyright.debian.net-api.md index 18064f78..38de3680 100644 --- a/doc/copyright.debian.net-api.md +++ b/doc/copyright.debian.net-api.md @@ -1,144 +1,136 @@ - -File-by-file API -================ +# File-by-file API The file-by-file API allows the user to search the license of file providing: -* a checksum; or: -* a package, version, path - +- a checksum; or: +- a package, version, path -URL schema ----------- +## URL schema -* /copyright/api/sha256/?checksum= +- /copyright/api/sha256/?checksum= - * Optional parameter package name: &package= - * Optional parameter suite (suite alias or latest): &suite= + - Optional parameter package name: &package= + - Optional parameter suite (suite alias or latest): &suite= -* /copyright/api/file/PACKAGE/VERSION/PATH/ - * Version can be a suite alias (jessie), a package version or the keywords - all and latest - * Package is not optional. Otherwise finding the file would consume a lot of - time. Possible workaround #761108 +- /copyright/api/file/PACKAGE/VERSION/PATH/ + - Version can be a suite alias (jessie), a package version or the keywords all and + latest + - Package is not optional. Otherwise finding the file would consume a lot of time. + Possible workaround #761108 -JSON structure ---------------- +## JSON structure -API should be homogeneous, meaning that the result provided in case of a -checksum or a (path, package, version) should have the same structure. This is -interesting for the end user as s.he will have to parse only one type of -results. 
+API should be homogeneous, meaning that the result provided in case of a checksum or a +(path, package, version) should have the same structure. This is interesting for the end +user as s.he will have to parse only one type of results. -Emitting license (and copyright) statements is the responsibility of external -"license oracles". Debsources only makes those statements available via this -API. Several license oracles are supported; the main one is "debian", which -gives access to the license information available in `debian/copyright` files. -Other license oracles will be supported in the future (see relevant section -below). The API will label each license statement with the name of the emitting -oracle. +Emitting license (and copyright) statements is the responsibility of external "license +oracles". Debsources only makes those statements available via this API. Several license +oracles are supported; the main one is "debian", which gives access to the license +information available in `debian/copyright` files. Other license oracles will be +supported in the future (see relevant section below). The API will label each license +statement with the name of the emitting oracle. -It is also possible that a search by checksum returns more than one result -since the exact same file might appear in different packages, or different -versions of the same package. It is possible that the license statements -associated to the different occurrences of a given file will differ from one -another. These inconsistencies will not be hidden by the API, but rather -returned with information about its context. +It is also possible that a search by checksum returns more than one result since the +exact same file might appear in different packages, or different versions of the same +package. It is possible that the license statements associated to the different +occurrences of a given file will differ from one another. 
These inconsistencies will not +be hidden by the API, but rather returned with information about its context. -To address these requirements the JSON structure should have the following -form: +To address these requirements the JSON structure should have the following form: +```json { - results: [ + "results": [ + { + "sha256": "----", + "copyright": [ { - sha256: "----", - copyright: [ - { - path: "----", - package: "pkg2", - version: "v2", - license: "GPL2", - origin: "----", - oracle: "debian", - }, - { - path: "----", - package: "pkg2", - version: "v3", - license: "GPL3", - origin: "----", - oracle: "debian" - }, - ] + "path": "----", + "package": "pkg2", + "version": "v2", + "license": "GPL2", + "origin": "----", + "oracle": "debian" }, { - sha256: "----", - copyright: [ - { - path: "----", - package: "pkg1", - version: "v1", - license: "GPL3", - origin: "----", - oracle: "debian" - }, - { - path: "----", - package: "pkg1", - version: "v1", - license: "GPL2", - origin: "----", - oracle: "ninka" - }, - ] + "path": "----", + "package": "pkg2", + "version": "v3", + "license": "GPL3", + "origin": "----", + "oracle": "debian" + } + ] + }, + { + "sha256": "----", + "copyright": [ + { + "path": "----", + "package": "pkg1", + "version": "v1", + "license": "GPL3", + "origin": "----", + "oracle": "debian" }, - ] + { + "path": "----", + "package": "pkg1", + "version": "v1", + "license": "GPL2", + "origin": "----", + "oracle": "ninka" + } + ] + } + ] } +``` -The copyright dictionary contains a list with all the appearances of the -specific checksum, providing their package, version, path info along with the -license synopsis, its origin (e.g., link to the `debian/copyright` file) and -the license oracle used to retrieve the license. 
+The copyright dictionary contains a list with all the appearances of the specific +checksum, providing their package, version, path info along with the license synopsis, +its origin (e.g., link to the `debian/copyright` file) and the license oracle used to +retrieve the license. -In case the user searches using a file name, package, version then the -copyright list will contain a single result for each license oracle used. +In case the user searches using a file name, package, version then the copyright list +will contain a single result for each license oracle used. -The results field allows grouping by checksum. This is necessary for -searching with file name / package with 'all' as version since each file might -have different checksums because of the changes between the versions. +The results field allows grouping by checksum. This is necessary for searching with file +name / package with 'all' as version since each file might have different checksums +because of the changes between the versions. -Batch API -================ +# Batch API -This API allows the user to search the license of many files (batch) at once -providing their checksum +This API allows the user to search the license of many files (batch) at once providing +their checksum -URL schema ----------- +## URL schema -* /copyright/api/sha256/ +- /copyright/api/sha256/ -The API accepts an HTTP POST request. The data must be form-encoded, repeating -the checksum parameter for multiple values. -For example, if you are using python requests to create the POST request then -the dictionnary containing the values should have the following structure: +The API accepts an HTTP POST request. The data must be form-encoded, repeating the +checksum parameter for multiple values. 
For example, if you are using python requests to +create the POST request then the dictionnary containing the values should have the +following structure: +```json { "checksums": [SUM1, SUM2, SUM3, ...], "package": PACKAGE, "suite": SUITE } +``` -Checksums is a list of sha256 checksums. -The package and suite parameters are optional. Suite can be a suite alias or -latest. +Checksums is a list of sha256 checksums. The package and suite parameters are optional. +Suite can be a suite alias or latest. -JSON structure ---------------- +## JSON structure -The results are homogeneous, independently on whether the file-by-file API or -the batch one has been invoked. +The results are homogeneous, independently on whether the file-by-file API or the batch +one has been invoked. +```json { result: [ { @@ -185,37 +177,32 @@ the batch one has been invoked. }, ] } +``` The JSON structure is identical to the file-by-file one with the results field -containing the list of checksums provided by the user along with the relevant -copyright dictionnary. +containing the list of checksums provided by the user along with the relevant copyright +dictionnary. -Supported license oracles -================ +# Supported license oracles -* **debian**: corresponding to information retrieved from `debian/copyright` - files. +- **debian**: corresponding to information retrieved from `debian/copyright` files. - If `debian/copyright` is [machine-readable][1] for the requested file, the - license field returned by this API will contain the value of the relevant - "License:" field in `debian/copyright` + If `debian/copyright` is [machine-readable][1] for the requested file, the license + field returned by this API will contain the value of the relevant "License:" field in + `debian/copyright` - If `debian/copyright` is not machine-readable, the license field will be - null. + If `debian/copyright` is not machine-readable, the license field will be null. 
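The batch request and the handling of a null license can be sketched as follows, using only the standard library. This is an illustration under stated assumptions, not the reference client: the helper names `batch_body` and `licenses_by_checksum` are hypothetical, the top-level key is assumed to be `results` as in the file-by-file structure, and the sample response is made-up data that only mirrors the documented shape.

```python
from urllib.parse import urlencode


def batch_body(checksums, package=None, suite=None):
    """Build the form-encoded POST body, repeating `checksums` once per value."""
    params = {"checksums": list(checksums)}
    if package is not None:
        params["package"] = package
    if suite is not None:
        params["suite"] = suite
    # doseq=True expands the list into repeated key=value pairs.
    return urlencode(params, doseq=True)


def licenses_by_checksum(response, oracle="debian"):
    """Map each sha256 to the licenses reported by one oracle.

    Entries whose license is null (debian/copyright not machine-readable)
    are skipped rather than reported as a license.
    """
    found = {}
    for result in response.get("results", []):
        found[result["sha256"]] = [
            entry["license"]
            for entry in result.get("copyright", [])
            if entry.get("oracle") == oracle and entry.get("license") is not None
        ]
    return found


# Made-up response with the documented structure (assumption, not real data).
SAMPLE_RESPONSE = {
    "results": [
        {
            "sha256": "aa",
            "copyright": [
                {"package": "pkg1", "version": "v1", "license": "GPL3", "oracle": "debian"},
                {"package": "pkg1", "version": "v2", "license": None, "oracle": "debian"},
                {"package": "pkg1", "version": "v1", "license": "GPL2", "oracle": "ninka"},
            ],
        }
    ]
}

body = batch_body(["aa", "bb"], package="pkg1")
licenses = licenses_by_checksum(SAMPLE_RESPONSE)
```

The `body` string would be sent as the payload of the POST request to `/copyright/api/sha256/`. Filtering on the `oracle` field client-side gives the same effect as the proposed `&oracle=` filter would on the server.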
[1]: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ -Other license oracles might be added in the future. In particular we are -considering adding support for FOSSology and Ninka. The API is extensible -enough to provide the user with the those results using the oracle field. Thus -the available information mined by the oracles will be added using another -dictionary field, specific to the oracle in question, in the copyright field. - +Other license oracles might be added in the future. In particular we are considering +adding support for FOSSology and Ninka. The API is extensible enough to provide the user +with those results using the oracle field. Thus the available information mined by +the oracles will be added using another dictionary field, specific to the oracle in +question, in the copyright field. -Filters -================ +# Filters The URL schemes could be extended with other optional parameters such as: -* &oracle= return only license statements emitted by the requested license - oracle +- &oracle= return only license statements emitted by the requested license oracle diff --git a/doc/copyright.debian.net-spec.md b/doc/copyright.debian.net-spec.md index bcc728bf..ee5a8537 100644 --- a/doc/copyright.debian.net-spec.md +++ b/doc/copyright.debian.net-spec.md @@ -1,5 +1,4 @@ -copyright.debian.net Specification - Version 0.3 -================================================ +# copyright.debian.net Specification - Version 0.3 The goal is to develop a web application that allow to browse, search, and publish on the web license and copyright information, as contained in Debian's @@ -10,38 +9,37 @@ files, as described in the relevant [standard][1]. 
[1]: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ +## Functional requirements -Functional requirements ------------------------ +- **location**: the web app will be located at -* **location**: the web app will be located at - -* it should allow **browsing/searching** through packages/versions as currently +- it should allow **browsing/searching** through packages/versions as currently possible using : the user should be able to browse by package prefix, or search packages by name -* once a fully-specified package/version is reached in the navigation, the web +- once a fully-specified package/version is reached in the navigation, the web app should return the content of the corresponding **debian/copyright** file: - * if the file does /not/ conform to the machine-readable format, then only a + - if the file does /not/ conform to the machine-readable format, then only a textual dump of the content of the file will be shown (this is the least-interesting/useful case) - * if the file does conform to the machine-readable format, then a proper - **rendering of the copyright file** will be given, exploiting the (parsable) - file-structure as much as possible, e.g.: + - if the file does conform to the machine-readable format, then a proper + **rendering of the copyright file** will be given, exploiting the (parsable) + file-structure as much as possible, e.g.: + + - folding license paragraphs + - adding hyperlinks to the relevant files within - * folding license paragraphs - * adding hyperlinks to the relevant files within - * adding hyperlinks to the authoritative license URLs, etc. + - adding hyperlinks to the authoritative license URLs, etc. 
-* when visiting a specific `debian/copyright` file, it should be possible to +- when visiting a specific `debian/copyright` file, it should be possible to convert it to [**SPDX**][2], offering the ability to download the generated XML file [2]: https://spdx.org/ -* in addition to package-based navigation, the web app will allow to **search +- in addition to package-based navigation, the web app will allow to **search by file** the set of available debian/copyright files. In the beginning (due to the lack of other file-based searches in Debsources), file-based search will be limited to SHA256-based search, as currently available for Debsources @@ -57,17 +55,17 @@ Functional requirements file (according to the semantics of "Files:" globs in the corresponding `debian/copyright` files). - Note that there might be inconsistencies in the obtained results: one - package could claim that a given file is, say, licensed under the GPL3 and - copyright Joe R. Developer, while another that the very same file is - licensed under LGPL2 and copyright John Doe. Such inconsistencies should - *not* be hidden to the user, but rather shown and highlighted (e.g., with - presentations like "according to package1/version2 license is..., - copyright is; ... according to package3/version4 ...."). + Note that there might be inconsistencies in the obtained results: one + package could claim that a given file is, say, licensed under the GPL3 and + copyright Joe R. Developer, while another that the very same file is + licensed under LGPL2 and copyright John Doe. Such inconsistencies should + _not_ be hidden to the user, but rather shown and highlighted (e.g., with + presentations like "according to package1/version2 license is..., + copyright is; ... according to package3/version4 ...."). -* extending on the above search-by-file feature, it should be possible to +- extending on the above search-by-file feature, it should be possible to request the generation of a **Bill of Material (BoM)**. 
The query API should - be extended to allow submitting *a list of SHA256* files as a single + be extended to allow submitting _a list of SHA256_ files as a single request. For long lists, the submission of a compressed version of such a list should be supported, for efficiency reasons. The web app should return a BoM, consisting of a list of license/copyright statements that match those @@ -76,35 +74,33 @@ Functional requirements best-/first-/random-match policy. It should be possible to request the BoM in various formats, including: - * a simple, human readable, text format (e.g. JSON, as already supported by + - a simple, human readable, text format (e.g. JSON, as already supported by Debsources API calls) - * [SPDX][2] + - [SPDX][2] Ideally, a compatible **client-side scanner** tool will be available, to scan a given software project available locally, contact copyright.d.n, and return a BoM. (Writing the scanner is outside the scope of the present specification.) -* **statistics**: we should collect/publish statistics about licenses, in the +- **statistics**: we should collect/publish statistics about licenses, in the same spirit of the current page +## Non-functional requirements -Non-functional requirements ---------------------------- - -* the web app will be a **front-end only**. Its back-end will be the same of +- the web app will be a **front-end only**. Its back-end will be the same of Debsources. In particular, the web app will use the same database and the same ORM abstractions on top of it. No specific updated will be deployed either. The app will use (read-only) the data maintained up-to-date by Debsources. -* **deployment modularity**: the web app should be deployable both on the +- **deployment modularity**: the web app should be deployable both on the separate website and within Debsources, e.g., starting from a URL like . 
To this end, the core functionalities will be included in a Flask blueprint and the blueprints (copyright & sources) will share the navigation code. -* **debian/copyright rendering module**: the rendering of a machine-readable +- **debian/copyright rendering module**: the rendering of a machine-readable `debian/copyright` should be a separate "debian/copyright renderer" component, reusable in a form of a python module. The module should consume a Copyright object from python-debian and export it in a jinja template. @@ -112,14 +108,14 @@ Non-functional requirements such as HTML and anticipate other uses besides Debsources i.e integrate it upstream. -* **infobox copyright link**: from it should be +- **infobox copyright link**: from it should be possible, after reaching a fully determined package/version, to follow a link "copyright" from the package infobox; that link will lead to the appropriate `debian/copyright` rendering page under (or, equivalently, under ) -* **sources of copyright & license information**: for the time being, +- **sources of copyright & license information**: for the time being, machine-readable `debian/copyright` files are the only source of copyright/license information that we can exploit. In the future we might have more, e.g., we might run tools like [FOSSology][3] or [Ninka][4] to @@ -131,7 +127,7 @@ Non-functional requirements [3]: http://www.fossology.org/projects/fossology [4]: http://ninka.turingmachine.org/ -* **on-the-fly vs batch parsing of debian/copyright:** for the time being, we +- **on-the-fly vs batch parsing of debian/copyright:** for the time being, we can parse on the fly `debian/copyright` files when the web app needs them. But that is sub-optimal, slow, and potentially dangerous (especially when the user asks for the copyright of a file whose SHA256 is very popular: @@ -139,17 +135,16 @@ Non-functional requirements leading to DoS scenarios). 
As soon as the celery migration is completed, a plugin should be developped to parse copyright information at update time. The result of the plugin will be storing into the DB (in a new table) - copyright/license statements; the web app will simply have to read those data. + copyright/license statements; the web app will simply have to read those data. We should keep in mind this extension, to ease future migrations. -* **debian/copyright parser**: to parse machine-readable `debian/copyright` +- **debian/copyright parser**: to parse machine-readable `debian/copyright` files, we will use the `debian.copyright` module shipped by the [python-debian][5]. An example of how to use the module is available at [5]: https://tracker.debian.org/pkg/python-debian - diff --git a/doc/exclude.md b/doc/exclude.md index ea795397..e7077f73 100644 --- a/doc/exclude.md +++ b/doc/exclude.md @@ -1,12 +1,10 @@ Debsources has the ability to exclude parts of the available content from processing. Exclusions come in 2 flavors: -* file-based exclusions -* package exclusions +- file-based exclusions +- package exclusions - -File-based exclusions -===================== +# File-based exclusions A file called `LOCAL/exclude.conf`, where `LOCAL` is Debsources local directory (`local_dir` configuration entry), can be used to exclude specific files from @@ -18,10 +16,10 @@ processed by any plugin. The file is a Deb822-like file made of several stanzas (or "paragraphs"), separated by empty lines. The general stanza format is as follows: - Explanation: (optional) some commentary explaining the exclusion - Package: affected source package - Files: UNIX-style glob of files to exclude - Action: remove + Explanation: (optional) some commentary explaining the exclusion + Package: affected source package + Files: UNIX-style glob of files to exclude + Action: remove `Files` field is space separated. All fields are "folded", i.e. 
they can be broken into several physical lines, indenting subsequent lines by one space. @@ -31,7 +29,7 @@ broken into several physical lines, indenting subsequent lines by one space. - initially, all files of (any version of) `Package` are eligible for exclusion - eligible files are filtered using `Files`: only files that match at least one - of its glob patterns are retained. Patterns are matched relatively to + of its glob patterns are retained. Patterns are matched relatively to package root directories, AKA their extraction directories After the evaluation of the above fields, all files eligible for exclusion get @@ -42,46 +40,40 @@ Supported actions, i.e., valid fields for `Action` are: - `remove`: remove excluded files from both the package extraction directory and Debsources DB. - -Maintenance ------------ +## Maintenance Note that changes to `exclude.conf` do not trigger re-extraction of a package. If you change `exclude.conf` to exclude a file, you will need to remove the package from debsources and update again to have the exclusion take effect; similarly if you drop a previously enacted exclusion. +## Examples -Examples --------- - - Explanation: #742605 - Package: chromium-browser - Files: foo/bar.c - Action: remove - - Explanation: non free, non redistributable, see #XXXXXX - Package: bad-bad-package - Files: baz/qux/*/*.c - Action: remove + Explanation: #742605 + Package: chromium-browser + Files: foo/bar.c + Action: remove + Explanation: non free, non redistributable, see #XXXXXX + Package: bad-bad-package + Files: baz/qux/*/*.c + Action: remove -Package exclusions -================== +# Package exclusions The configuration file is shared with file-based exclusions (`LOCAL/exclude.conf`), and will therefore contain entries for both file-based -and package exclusions. To decide which is which, the following rule applies. +and package exclusions. To decide which is which, the following rule applies. 
A stanza that contains a `Files:` field is a file-based exclusion stanza. A -stanza that does *not* contain a `Files:` (and contains a `Package:` field) is +stanza that does _not_ contain a `Files:` (and contains a `Package:` field) is a package exclusion stanza. The general stanza format for package exclusions is as follows: - Explanation: (optional) some commentary explaining the exclusion - Package: affected source package - Version: (optional) affected package version - Action: ignore + Explanation: (optional) some commentary explaining the exclusion + Package: affected source package + Version: (optional) affected package version + Action: ignore The fields `Explanation`, `Package`, and `Action` are as per file-based exclusions (although `Action` supports different values, see below). @@ -98,5 +90,5 @@ version matches the value of that field. Supported actions, i.e., valid fields for `Action` are: - `ignore`: ignore the package when updating Debsources. At present, this - specifically means only not *adding* it to the data storage, but not + specifically means only not _adding_ it to the data storage, but not necessarily removing it if it has been added in the past. 
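Reviewer note on the `doc/exclude.md` rule above (a stanza with a `Files:` field is a file-based exclusion, one with only `Package:` is a package exclusion): the rule can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the parser Debsources actually uses; the function names `parse_stanzas` and `classify` and the `Version` value in the sample are made up for the example.

```python
def parse_stanzas(text):
    """Split Deb822-like text into a list of {field: value} dicts."""
    stanzas, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current stanza.
            if current:
                stanzas.append(current)
            current, last_key = {}, None
        elif line[0] in " \t" and last_key is not None:
            # Folded field: an indented line continues the previous field.
            current[last_key] += " " + line.strip()
        else:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def classify(stanza):
    """Apply the rule from the doc: `Files` wins, then `Package`."""
    if "Files" in stanza:
        return "file"
    if "Package" in stanza:
        return "package"
    return "unknown"


# First stanza is taken from the doc's examples; the second uses a
# hypothetical version number to show a package exclusion.
EXAMPLE = """\
Explanation: #742605
Package: chromium-browser
Files: foo/bar.c
Action: remove

Package: bad-bad-package
Version: 1.0-1
Action: ignore
"""

if __name__ == "__main__":
    for stanza in parse_stanzas(EXAMPLE):
        print(classify(stanza), stanza["Package"], stanza["Action"])
```

Since `Files` is documented as space separated, the individual globs of a file-based stanza can be recovered with `stanza["Files"].split()`, which also works for folded values.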
diff --git a/doc/extract_all.bench.md b/doc/extract_all.bench.md index d5200573..7a22d787 100644 --- a/doc/extract_all.bench.md +++ b/doc/extract_all.bench.md @@ -1,24 +1,27 @@ # first extraction -zack@tytso:/srv/source.debian.org$ time bin/extract_all -real 111m46.898s -user 74m26.915s -sys 19m7.196s + +zack@tytso:/srv/source.debian.org$ time bin/extract_all +real 111m46.898s +user 74m26.915s +sys 19m7.196s # one (random) update after mirror pulse -zack@tytso:/srv/source.debian.org$ time bin/extract_all -real 3m52.108s -user 0m51.531s -sys 0m29.690s + +zack@tytso:/srv/source.debian.org$ time bin/extract_all +real 3m52.108s +user 0m51.531s +sys 0m29.690s # a do-nothing update -zack@tytso:/srv/source.debian.org$ time bin/extract_all -real 2m55.179s -user 0m10.025s -sys 0m15.985s + +zack@tytso:/srv/source.debian.org$ time bin/extract_all +real 2m55.179s +user 0m10.025s +sys 0m15.985s # clean up bench -zack@tytso:/srv/sources.debian.org$ time rm -rf sources/ -real 3m7.814s -user 0m5.996s -sys 2m27.109s +zack@tytso:/srv/sources.debian.org$ time rm -rf sources/ +real 3m7.814s +user 0m5.996s +sys 2m27.109s diff --git a/doc/local-info.md b/doc/local-info.md index 8b8afb2c..d9915a7a 100644 --- a/doc/local-info.md +++ b/doc/local-info.md @@ -1,17 +1,16 @@ -Local Information -================= +# Local Information You can customize a Debsources instance to publish on the web local -information, such as news, instance-specific information, etc. To do so you -should create files containing HTML snippets in the *local directory* of your +information, such as news, instance-specific information, etc. To do so you +should create files containing HTML snippets in the _local directory_ of your Debsources instance, which by default is located at `$DEBSOURCES_ROOT/local/`. You can customize the path of your local directory in `config.ini` as follows: - [DEFAULT] - # ... - local_dir: /some/where/else # defaults to: %(root_dir)s/local - # ... + [DEFAULT] + # ... 
+ local_dir: /some/where/else # defaults to: %(root_dir)s/local + # ... At present, you can add the following kind of local information to your Debsources instance: diff --git a/doc/maintenance.md b/doc/maintenance.md index 6caf4038..371c3c75 100644 --- a/doc/maintenance.md +++ b/doc/maintenance.md @@ -1,5 +1,4 @@ -Recreate the DB -=============== +# Recreate the DB To recreate the DB using the current state of Debsources file storage as reference: @@ -7,15 +6,13 @@ reference: 1. reset the current DB by either emptying all tables or simply recreating the DB, e.g. with: - $ bin/debsources-dbadmin --dropdb --createdb postgresql:///debsources + $ bin/debsources-dbadmin --dropdb --createdb postgresql:///debsources 2. refill DB backend(s), skipping filesystem: - $ bin/debsources-update --backend db --backend hooks --backend hooks.db + $ bin/debsources-update --backend db --backend hooks --backend hooks.db - -Add/remove plugins -================== +# Add/remove plugins To add a plugin: @@ -23,8 +20,8 @@ To add a plugin: 2. trigger the `add-package` event: - $ bin/debsources-update -vvv --backend hooks.fs --backend hooks.db \ - --trigger add-package/NAME + $ bin/debsources-update -vvv --backend hooks.fs --backend hooks.db \ + --trigger add-package/NAME 3. add the hook to the `hooks` configuration entry in config(.local).ini diff --git a/doc/mirror-test.md b/doc/mirror-test.org similarity index 100% rename from doc/mirror-test.md rename to doc/mirror-test.org diff --git a/doc/new-release.md b/doc/new-release.md index bdf350ef..0eb5ce72 100644 --- a/doc/new-release.md +++ b/doc/new-release.md @@ -1,5 +1,5 @@ # When a new Debian release is out -* update lib/debsources/consts.txt +- update lib/debsources/consts.txt Remember to also add oldrelease-backports-sloppy if relevant. 
-* archive oldoldstable (see doc/archiving-a-suite.txt) +- archive oldoldstable (see doc/archiving-a-suite.txt) diff --git a/doc/patch-tracker.md b/doc/patch-tracker.md index 07a2576d..104235c7 100644 --- a/doc/patch-tracker.md +++ b/doc/patch-tracker.md @@ -1,238 +1,246 @@ -Old patch tracker -================ +# Old patch tracker -Browsing ----------- +## Browsing -* Package prefix -* Package -* Package / version +- Package prefix +- Package +- Package / version -Views ----------- +## Views -* View package summary (version, checksum, diff files, - debian or upstream patches) -* View patch -* Syntax highlight using pygments -* Download patch +- View package summary (version, checksum, diff files, debian or upstream patches) +- View patch +- Syntax highlight using pygments +- Download patch +## DB - caching -DB - caching ----------- +- Storing package info (maintainer, uploader, name version, diff size, checksum, deb tar + size checksum..), +- Caching objects on disk (filterdiffs, diffgz) to reuse later -* Storing package info (maintainer, uploader, name version, diff size, - checksum, deb tar size checksum..), -* Caching objects on disk (filterdiffs, diffgz) to reuse later +## Patch formats -Patch formats ----------- +- Quilt series +- Dpatch +- Cdbs -* Quilt series -* Dpatch -* Cdbs +## Misc -Misc ----------- +- Export to UDD -* Export to UDD +## TODO - Suggestions -TODO - Suggestions ----------- +- The diffstat should link to anchors embedded in the diff for each file. +- Diff between orig.tar.gz for 3.0 +- Different colors for co-maintained packages +- Extract comments from patch series to add in summary +- When viewing package list/versions or packages per maintainer it is usefull to mention + the packages that do not have patches. (so people don't click for nothing) +- links to BTS for the closed bugs +- cross-distro solution? -* The diffstat should link to anchors embedded in the diff for each file. 
-* Diff between orig.tar.gz for 3.0 -* Different colors for co-maintained packages -* Extract comments from patch series to add in summary -* When viewing package list/versions or packages per maintainer it is - usefull to mention the packages that do not have patches. - (so people don't click for nothing) -* links to BTS for the closed bugs -* cross-distro solution? +# Requirement Analysis -Requirement Analysis -================ +## Target users -Target users ----------- +- Debian developers / maintainers / contributors -* Debian developers / maintainers / contributors +- Upstream -* Upstream +- 3rd party distributions -* 3rd party distributions - -Use stories ----------- +## Use stories [general - can be any of the 3 target users] -* As a patch tracker user I want to be able to browse by the prefix of packages -so that I can find specific packages. +- As a patch tracker user I want to be able to browse by the prefix of packages so that + I can find specific packages. - Acceptance criteria: - view list of package prefixes - - click on package prefix to find list of packages - under that prefix + Acceptance criteria: - view list of package prefixes + - click on package prefix to find list of packages + under that prefix -* As a patch tracker user I want to be able to search for a package to find its -patches. - - Acceptance criteria: - search form to input a package name - - get exact matches and other results +- As a patch tracker user I want to be able to search for a package to find its patches. -* As a patch tracker user I want to be able to browse between different versions -of each package to view patches in a specific version of a package. 
+ Acceptance criteria: - search form to input a package name + - get exact matches and other results - Acceptance criteria: - view package versions inside a package page - - click on package version to find list of patches - under that version +- As a patch tracker user I want to be able to browse between different versions of each + package to view patches in a specific version of a package. -* As a patch tracker user I want to be able to view a patch with highlighted -syntax to track changed files and code. + Acceptance criteria: - view package versions inside a package page + - click on package version to find list of patches + under that version - Acceptance criteria: - view patch in highlighted syntax +- As a patch tracker user I want to be able to view a patch with highlighted syntax to + track changed files and code. -* As a patch tracker user I want to be able to download a patch in order to -apply it locally. + Acceptance criteria: - view patch in highlighted syntax - Acceptance criteria: - raw download of the patch +- As a patch tracker user I want to be able to download a patch in order to apply it + locally. -* As a patch tracker user I want to be able to view the discription (if it -exists) of the patches in the list of a patches in order to identify the one I -am interested in. + Acceptance criteria: - raw download of the patch - Acceptance criteria: - list description in the patches summary +- As a patch tracker user I want to be able to view the discription (if it exists) of + the patches in the list of a patches in order to identify the one I am interested in. -* As a patch tracker user I want to be able to view the summary of a patch -(files changed) in the list of patches in order to identify the one I am -interested in. 
+ Acceptance criteria: - list description in the patches summary - Acceptance criteria: - list changed files in the patches summary +- As a patch tracker user I want to be able to view the summary of a patch (files + changed) in the list of patches in order to identify the one I am interested in. -* As a patch tracker user I want links pointing to Debsources from the modified -files mentioned in the description to view the source code. + Acceptance criteria: - list changed files in the patches summary - Acceptance criteria: - view links to Debsources from the changed files in - the patches summary +- As a patch tracker user I want links pointing to Debsources from the modified files + mentioned in the description to view the source code. -* As a patch tracker user I want links pointing to the bug tracker for the bugs -mentioned by the patches to view the origin of the bug. + Acceptance criteria: - view links to Debsources from the changed files in + the patches summary - Acceptance criteria: - view link to BTS for the bugs mentioned in a patch +- As a patch tracker user I want links pointing to the bug tracker for the bugs + mentioned by the patches to view the origin of the bug. + Acceptance criteria: - view link to BTS for the bugs mentioned in a patch [Debian roles] -* As a Debian developer I want to view a patch applied in a package in order to -understand how a bug was resolved. - - Acceptance criteria: - view code and changes of a patch with highlighted - syntax +- As a Debian developer I want to view a patch applied in a package in order to + understand how a bug was resolved. -* As a Debian maintainer I want to download a patch applied in a package in -order to solve a bug present in a package I maintain. 
+ Acceptance criteria: - view code and changes of a patch with highlighted + syntax - Acceptance criteria: - raw download of the patch +- As a Debian maintainer I want to download a patch applied in a package in order to + solve a bug present in a package I maintain. + Acceptance criteria: - raw download of the patch [Upstream] -* As an upstream author I want to be able to track changes between the -orig.tar.gz and the package in Debian. +- As an upstream author I want to be able to track changes between the orig.tar.gz and + the package in Debian. - Acceptance criteria: - view diff of orig.tar.gz and Debian package - - download diff of orig.tar.gz and Debian package + Acceptance criteria: - view diff of orig.tar.gz and Debian package + - download diff of orig.tar.gz and Debian package -* As an upstream author I want to be able to view the checksum of the -orig.tar.gz used in Debian so that I find out if there are any changes between -the released software and the one shipped by Debian. +- As an upstream author I want to be able to view the checksum of the orig.tar.gz used + in Debian so that I find out if there are any changes between the released software + and the one shipped by Debian. - Acceptance criteria: - view checksum of the orig.tar.gz + Acceptance criteria: - view checksum of the orig.tar.gz -* As an upstream author I want to be able to view a summary of changes (files -modified, number of lines etc) between orig.tar.gz and the package in Debian. +- As an upstream author I want to be able to view a summary of changes (files modified, + number of lines etc) between orig.tar.gz and the package in Debian. - Acceptance criteria: - view summary of all patches together + Acceptance criteria: - view summary of all patches together -* As an upstream author I want to be able to download patches that solved bugs -in Debian that are still present in my release. 
- - Acceptance criteria: - raw download of patch +- As an upstream author I want to be able to download patches that solved bugs in Debian + that are still present in my release. + Acceptance criteria: - raw download of patch [3rd party distributions] -* As a contributor in another distribution I want to be able to view patches -applied in a package in Debian to fix bugs in the distribution I contribute. +- As a contributor in another distribution I want to be able to view patches applied in + a package in Debian to fix bugs in the distribution I contribute. + + Acceptance criteria: - view patch in highlighted syntax - Acceptance criteria: - view patch in highlighted syntax +- As a contributor in another distribution I want to be able to download patches applied + in a package in Debian to apply it locally in my package. -* As a contributor in another distribution I want to be able to download -patches applied in a package in Debian to apply it locally in my package. + Acceptance criteria: - raw download of patch - Acceptance criteria: - raw download of patch +## Use cases -Use cases ----------- +### Id: Case#1 -Id: Case#1 Name: Navigate in the patch tracker + Actors: Patch tracker user, Debsources + Pre-conditions: User is at the index of the patch tracker + Normal flow: -1) The user will click on a package prefix -2) Debsources redirects the user to the page containing the list of packages -under that package prefix -3) The user will choose and click the package s/he is interested in -4) Debsources redirects the user to the list of versions of the package the user -selected -5) The user will select and click on a version -6) Debsources redirects the user to the specific page of that package - version -containing the basic information of that package, the summary of patches and the -list of patches. + +1. The user will click on a package prefix +2. Debsources redirects the user to the page containing the list of packages under that + package prefix +3. 
The user will choose and click the package s/he is interested in +4. Debsources redirects the user to the list of versions of the package the user + selected +5. The user will select and click on a version +6. Debsources redirects the user to the specific page of that package - version + containing the basic information of that package, the summary of patches and the list + of patches. Alternate flow: -1a) The user uses the search form to search for a package - 2) Debsources will find the exact matches and other results of that search - 3) Continue to the normal flow -1b) The user will choose to view the page with all the packages - 2) Debsources will redirect the user to a page where all the packages are - listed - 3) Continue to the normal flow +1a. The user uses the search form to search for a package + +2a. Debsources will find the exact matches and other results of that search + +3a. Continue to the normal flow + +1b. The user will choose to view the page with all the packages + +2b. Debsources will redirect the user to a page where all the packages are listed + +3b. Continue to the normal flow + +### Id: Case#2 -Id: Case#2 Name: Download raw patch + Actors: Patch tracker user, Debsources + Pre-conditions: User is at the summary of a package + Normal flow: -1) The user will click on the download link of a specific patch -2) Debsources provides the user with the raw patch to save locally -Id: Case#3 +1. The user will click on the download link of a specific patch +2. Debsources provides the user with the raw patch to save locally + +### Id: Case#3 + Name: View a path + Actors: Patch tracker user, Debsources + Pre-conditions: User is at the summary of a package + Normal flow: -1) The user will select a patch applied by the Debian maintainer -2) Debsources redirects the user to the page containing the patch -3) Debsources highlights the syntax of the patch + +1. The user will select a patch applied by the Debian maintainer +2. 
Debsources redirects the user to the page containing the patch +3. Debsources highlights the syntax of the patch Alternate flow: -1a) The user chooses an upstream patch -3a) The user disables Javascript and the user views a plain dump of the patch -without syntax highlighting -Id: Case#4 +1a. The user chooses an upstream patch + +3a. The user disables Javascript and the user views a plain dump of the patch without +syntax highlighting + +### Id: Case#4 + Name: View diff of orig.tar.gz and Debian package + Actors: Upstream author, Debsources + Pre-conditions: User is at the summary of a package + Normal flow: -1) The user will select a patch applied by the Debian maintainer -2) Debsources redirects the user to the page containing the patch -3) Debsources highlights the syntax of the patch + +1. The user will select a patch applied by the Debian maintainer +2. Debsources redirects the user to the page containing the patch +3. Debsources highlights the syntax of the patch Alternate flow: -1a) The user chooses an upstream patch -3a) The user disables Javascript and the user views a plain dump of the patch -without syntax highlighting \ No newline at end of file + +1a. The user chooses an upstream patch + +3a. The user disables Javascript and the user views a plain dump of the patch without +syntax highlighting diff --git a/doc/patches.debian.net-api.md b/doc/patches.debian.net-api.md index 141649ad..2d84f7c9 100644 --- a/doc/patches.debian.net-api.md +++ b/doc/patches.debian.net-api.md @@ -1,102 +1,88 @@ -Navigation -================ +# Navigation -By prefix ----------- +## By prefix -* /api/prefix/PREFIX/ -Retrieve the packages under a prefix +- /api/prefix/PREFIX/ Retrieve the packages under a prefix -By list ----------- +## By list -* /list/INT/ -Retrieve a list of paginated packages. -INT is an integer to indicate the page number +- /list/INT/ Retrieve a list of paginated packages. 
INT is an integer to indicate the + page number +# Package Summary API -Package Summary API -================ +The package summary API allows the user to retrieve patch related information providing: -The package summary API allows the user to retrieve patch related information -providing: +- a package, version -* a package, version +## URL schema +- /patches/api/summary/PACKAGE/VERSION/ + - Version can be a suite alias (jessie), a package version or the keyword "latest" -URL schema ----------- +## JSON structure -* /patches/api/summary/PACKAGE/VERSION/ - * Version can be a suite alias (jessie), a package version or the keyword - "latest" - - -JSON structure ---------------- - -The API returns the package related information such as the package, the -version, the format, a checksum of the orig.tar.gz as well as list of patches applied in the package along with some other useful information such as the -file deltas, the description and the download url. +The API returns the package related information such as the package, the version, the +format, a checksum of the orig.tar.gz as well as list of patches applied in the package +along with some other useful information such as the file deltas, the description and +the download url. 
The above information are to be inserted in the following JSON structure +```json { - results: - { - package: "----", - version: "----", - format: "----", - orig_checksum: "----", - patches: [ - { - name: "----", - url: "----" - }, - { - name: "----", - url: "----" - }, - { - name: "----", - url: "----" - }, - ] - }, + "results": { + "package": "----", + "version": "----", + "format": "----", + "orig_checksum": "----", + "patches": [ + { + "name": "----", + "url": "----" + }, + { + "name": "----", + "url": "----" + }, + { + "name": "----", + "url": "----" + } + ] + } } +``` -Patch API -================ +# Patch API This API allows the user to retrieve details of a single patch -URL schema ----------- +## URL schema -* /patches/api/patch/PACKAGE/VERSION/PATCH_PATH +- /patches/api/patch/PACKAGE/VERSION/PATCH_PATH -The PATCH_PATH is the path of the patch __inside__ the debian/patches folder. -Version can be a suite alias (jessie), a package version or the keyword -"latest" +The PATCH_PATH is the path of the patch **inside** the debian/patches folder. Version +can be a suite alias (jessie), a package version or the keyword "latest" -JSON structure ---------------- +## JSON structure -The results are homogeneous, with the package summary API. +The results are homogeneous, with the package summary API. +```json { - results: - { - package: "----", - version: "----", - format: "----", - orig_checksum: "----", - name: "----", - file_deltas: "----", - description: "----", - url: "----" - } + "results": { + "package": "----", + "version": "----", + "format": "----", + "orig_checksum": "----", + "name": "----", + "file_deltas": "----", + "description": "----", + "url": "----" + } } +``` -The JSON structure is identical to the summary package one with the results -field. This enables the user to parse the API with a single tool. +The JSON structure is identical to the summary package one with the results field. This +enables the user to parse the API with a single tool. 
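Reviewer note on `doc/patches.debian.net-api.md`: the claim that both endpoints share a `results` envelope, so one tool can parse either, can be illustrated with a short Python sketch. It works on in-memory strings shaped like the JSON structures in the doc; every value in the two samples (package name, version, checksum, URLs) is a placeholder invented for the example, and the helper names `results_of` and `list_patches` are hypothetical.

```python
import json

# Shaped like the package summary response; all values are placeholders.
SUMMARY = """
{
  "results": {
    "package": "hello",
    "version": "2.10-1",
    "format": "3.0 (quilt)",
    "orig_checksum": "0000",
    "patches": [
      {"name": "fix-typo.patch", "url": "https://example.invalid/fix-typo.patch"}
    ]
  }
}
"""

# Shaped like the single patch response; all values are placeholders.
SINGLE = """
{
  "results": {
    "package": "hello",
    "version": "2.10-1",
    "format": "3.0 (quilt)",
    "orig_checksum": "0000",
    "name": "fix-typo.patch",
    "file_deltas": "1 file changed",
    "description": "Fix a typo",
    "url": "https://example.invalid/fix-typo.patch"
  }
}
"""


def results_of(text):
    """Return the `results` object common to both responses."""
    return json.loads(text)["results"]


def list_patches(results):
    """Return (name, url) pairs; empty when there is no `patches` list."""
    return [(p["name"], p["url"]) for p in results.get("patches", [])]


if __name__ == "__main__":
    for text in (SUMMARY, SINGLE):
        results = results_of(text)
        print(results["package"], results["version"], list_patches(results))
```

The same `results_of` call serves both responses; only the fields read afterwards differ.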
diff --git a/doc/postgres.md b/doc/postgres.md index dfb709ce..1f978eeb 100644 --- a/doc/postgres.md +++ b/doc/postgres.md @@ -1,62 +1,66 @@ -Roles management -================ +# Roles management -* The updater must have read-write rights on debsources tables and - sequences. To enable it, run for example in a psql session: -# grant select,insert, update, delete on all tables in schema public to debsource_updater; -# grant select, update on all sequences in schema public to debsource_updater; +- The updater must have read-write rights on debsources tables and sequences. To enable + it, run for example in a psql session: -* The web application must have read rights on debsources tables: -# grant select on all tables in schema public to debsource_webapp; +```sql +grant select,insert, update, delete on all tables in schema public to debsource_updater; +grant select, update on all sequences in schema public to debsource_updater; +``` + +- The web application must have read rights on debsources tables: + +```sql +grant select on all tables in schema public to debsource_webapp; +``` You can specify in your config files different `db_uri` in different sections. -Performance tuning -================== +# Performance tuning https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server -buffers -------- +## buffers -# sysctl -w kernel.shmmax=17179869184 -# sysctl -w kernel.shmall=4194304 +```shell +sysctl -w kernel.shmmax=17179869184 +sysctl -w kernel.shmall=4194304 +``` then save into /etc/sysctl.conf shared_buffers = 12 GB - -cache ------ +## cache effective_cache_size = 16GB +## checkpoints -checkpoints ------------ - -checkpoint_segments = 256 # i.e. every 4 GB - +checkpoint_segments = 256 # i.e. 
every 4 GB -Trigram index -============= +# Trigram index http://www.postgresql.org/docs/9.1/static/pgtrgm.html -To enable trigram indexes (used for the file table) you'll need, on a per DB -basis: +To enable trigram indexes (used for the file table) you'll need, on a per DB basis: - CREATE EXTENSION pg_trgm; +```sql +CREATE EXTENSION pg_trgm; +``` Then, for instance: - CREATE INDEX ix_files_path_trgm - ON files - USING gin (encode(path, 'escape') gin_trgm_ops); +```sql +CREATE INDEX ix_files_path_trgm +ON files +USING gin (encode(path, 'escape') gin_trgm_ops); +``` which can be queried efficiently using queries like: - SELECT * - FROM files - WHERE encode(path, 'escape') LIKE '%stdio%'; +```sql +SELECT * +FROM files +WHERE encode(path, 'escape') LIKE '%stdio%'; +``` diff --git a/doc/sources-cache.md b/doc/sources-cache.md index c13d46ef..4355fa6d 100644 --- a/doc/sources-cache.md +++ b/doc/sources-cache.md @@ -1,5 +1,4 @@ -cache/sources.txt - file format -=============================== +# cache/sources.txt - file format The sources cache file, usually located in cache/sources.txt is an always up-to-date cache of the sources currently available in a Debsources instance. 
@@ -15,31 +14,29 @@ The format is as follows: - the fields are as follows: - - **PACKAGE**: *source* package name + - **PACKAGE**: _source_ package name - **VERSION**: source package version - **AREA**: package archive area, one of `main`, `contrib`, `non-free`, `non-free-firmware` - - **DSC**: absolute path to the corresponding `.dsc` file - - *note*: the fact that the path is absolute is unfortunate; in the future - this might be changed to be a path relative to Debsources mirror dir - (which is a configuration option) + - **DSC**: absolute path to the corresponding `.dsc` file + + _note_: the fact that the path is absolute is unfortunate; in the future + this might be changed to be a path relative to Debsources mirror dir + (which is a configuration option) - **DEST**: absolute path to a directory where the source package is currently available in unpacked form (i.e. the dir that you will obtain by - using `dpkg-source -x` on DSC) - - *note*: this might be changed in the future to be a path relative to - Debsources sources dir (which is a configuration option) + using `dpkg-source -x` on DSC) + + _note_: this might be changed in the future to be a path relative to + Debsources sources dir (which is a configuration option) - **SUITES**: a comma separated list of Debian suites of which the source package is part. Suite names are alphabetically sorted. - -bin/debsources-foreach -=========== +# bin/debsources-foreach The `bin/debsources-foreach` helper script can be used to quickly execute scripts in batch on all available source packages, based on sources.txt content. 
@@ -48,21 +45,21 @@ Here is an example which just dumps all information available in the source cache, showing the augmented environment that foreach prepares for client code: $ bin/debsources-foreach cache/sources.txt 'echo ; pwd; env | grep DEBSOURCES_' - + /srv/debsources/sources/main/l/ledger/2.6.2-3.1 - DEBSOURCES_DIR=/srv/debsources/sources/main/l/ledger/2.6.2-3.1 - DEBSOURCES_PACKAGE=ledger - DEBSOURCES_DSC=/srv/debsources/testdata/mirror/pool/main/l/ledger/ledger_2.6.2-3.1.dsc - DEBSOURCES_AREA=main - DEBSOURCES_VERSION=2.6.2-3.1 - DEBSOURCES_SUITES=jessie,wheezy,sid - + DEBSOURCES_DIR=/srv/debsources/sources/main/l/ledger/2.6.2-3.1 + DEBSOURCES_PACKAGE=ledger + DEBSOURCES_DSC=/srv/debsources/testdata/mirror/pool/main/l/ledger/ledger_2.6.2-3.1.dsc + DEBSOURCES_AREA=main + DEBSOURCES_VERSION=2.6.2-3.1 + DEBSOURCES_SUITES=jessie,wheezy,sid + /srv/debsources/sources/contrib/n/nvidia-support/20131102+1 - DEBSOURCES_DIR=/srv/debsources/sources/contrib/n/nvidia-support/20131102+1 - DEBSOURCES_PACKAGE=nvidia-support - DEBSOURCES_DSC=/srv/debsources/testdata/mirror/pool/contrib/n/nvidia-support/nvidia-support_20131102+1.dsc - DEBSOURCES_AREA=contrib - DEBSOURCES_VERSION=20131102+1 - DEBSOURCES_SUITES=jessie,sid - + DEBSOURCES_DIR=/srv/debsources/sources/contrib/n/nvidia-support/20131102+1 + DEBSOURCES_PACKAGE=nvidia-support + DEBSOURCES_DSC=/srv/debsources/testdata/mirror/pool/contrib/n/nvidia-support/nvidia-support_20131102+1.dsc + DEBSOURCES_AREA=contrib + DEBSOURCES_VERSION=20131102+1 + DEBSOURCES_SUITES=jessie,sid + [...] 
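Reviewer note on `doc/sources-cache.md`: client code run by `bin/debsources-foreach` can read the `DEBSOURCES_*` variables shown in the example output above. The following Python sketch assumes only those six variable names; the function name `package_info` and the output format are made up for illustration.

```python
import os

# Variable suffixes as they appear in the example output above.
FIELDS = ("DIR", "PACKAGE", "VERSION", "AREA", "DSC", "SUITES")


def package_info(env=os.environ):
    """Collect the DEBSOURCES_* variables into a dict with lowercase keys."""
    info = {field.lower(): env.get("DEBSOURCES_" + field, "") for field in FIELDS}
    # SUITES is a comma separated list; an unset variable gives an empty list.
    info["suites"] = [suite for suite in info["suites"].split(",") if suite]
    return info


if __name__ == "__main__":
    info = package_info()
    suites = ", ".join(info["suites"])
    print(f"{info['package']} {info['version']} ({info['area']}): {suites}")
```

Saved under a hypothetical name such as `summarize.py`, it could be passed as the command argument in place of the `echo ; pwd; env` pipeline, for instance `bin/debsources-foreach cache/sources.txt 'python3 /path/to/summarize.py'`.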
diff --git a/doc/suite-archive.md b/doc/suite-archive.md index 64285067..373e6e23 100644 --- a/doc/suite-archive.md +++ b/doc/suite-archive.md @@ -1,7 +1,6 @@ +# Archive format -Archive format -============== - +``` binary source debsources version name release date pkgs pkgs notes @@ -35,5 +34,6 @@ version name release date pkgs pkgs notes 6.0 squeeze 6 Feb 2011 ~29,000 ~15,000 Sources, pool, .dsc 7 wheezy 4 May 2013 ~37,000 ~17,600 Sources, pool, .dsc +``` (some data from https://en.wikipedia.org/wiki/Debian#Timeline) diff --git a/doc/testing.md b/doc/testing.md index 9c0fb2fc..1b3f2822 100644 --- a/doc/testing.md +++ b/doc/testing.md @@ -1,50 +1,43 @@ -Test runner -=========== +# Test runner + +## Run tests locally -Run tests locally ------------------ Debsources test suite is managed with [Nose](https://nose.readthedocs.org/). To run the test suite execute the following from Debsources top-level dir: - $ nosetests3 -v debsources/tests/ + $ nosetests3 -v debsources/tests/ Check [Nose documentation](https://nose.readthedocs.org/en/latest/) for more advanced options, including how to only run specific tests, based on attributes. 
-Run tests in docker -------------------- +## Run tests in docker You can run debsources test suite in a docker container using the following commands (containers must be running): - $ cd contrib/docker - $ docker compose build - $ docker compose run app /opt/run-tests +$ cd contrib/docker +$ docker compose build +$ docker compose run app /opt/run-tests -Test attributes ---------------- +## Test attributes To get a list of available test attributes you might try something like: - $ rgrep -h @attr debsources/tests/ | tr -d '[:blank:]' | sort -u - + $ rgrep -h @attr debsources/tests/ | tr -d '[:blank:]' | sort -u -Test Postgres DB -================ +# Test Postgres DB To be able to run the tests tagged with attribute 'postgres' you need to have a local PostgreSQL installation with the ability for the user running the tests -to create (and destroy) a database called 'debsources-test'. On that database +to create (and destroy) a database called 'debsources-test'. On that database you should have enough privileges to create/drop tables, and perform select/insert/delete queries. If needed, you can change the name of the test database by changing TEST_DB_NAME in the tesdata.py module. - -Test data -========= +# Test data Large test data for Debsources, including a sample mirror and reference database, are kept in a separate Git repository to avoid cluttering and making @@ -54,34 +47,31 @@ registered as a submodule under the `testdata/` directory. Upon first clone of Debsources repository you might initialize and clone the testdata submodule by executing: - $ git submodule update --init + $ git submodule update --init On successful execution the above will populate `testdata/`. For more information check [Git submodule documentation](http://git-scm.com/docs/git-submodule). 
- -Maintaining testdata reference DB ---------------------------------- +## Maintaining testdata reference DB When the DB structure changes, or when new packages are added to the test data, the reference DBs contained---in DB dump form---under testdata/ will need to be updated to avoid test failures. Here is the recommended procedures to do that: -1. start with *clean slate*: clean your DB (e.g., `dropdb debsources`) and your +1. start with _clean slate_: clean your DB (e.g., `dropdb debsources`) and your local sources directory (e.g., `rm -rf /srv/debsources/sources`). Then re-inizialize an empty DB (e.g., `createdb debsources; bin/debsources-dbadmin --createdb postgres:///debsources`) -2. do a *full update run* using a version of the code base that is trusted +2. do a _full update run_ using a version of the code base that is trusted enough to create the new reference version of the DB to be used for tests (e.g., `bin/debsources-update -vv`) 3. $ cd testdata/ - $ make distclean # this will remove old DB dumps - $ make dump # this will create fresh DB dumps based on the current - # status of the DB just filled by the updater + $ make distclean # this will remove old DB dumps + $ make dump # this will create fresh DB dumps based on the current # status of the DB just filled by the updater 4. Now do `git diff` (in testdata/) to ensure that the new state of the reference DB is sane. Note: one of the two dump is in Postgres (binary) @@ -93,10 +83,10 @@ updated to avoid test failures. Here is the recommended procedures to do that: 5. If everything is fine: - $ git add db/ # this is under testdata/ + $ git add db/ # this is under testdata/ $ git commit $ git push $ cd .. 
- $ git add testdata # this is in the main debsources repo + $ git add testdata # this is in the main debsources repo $ git commit $ git push diff --git a/doc/twitter-bootstrap.md b/doc/twitter-bootstrap.md index f3e1a945..45e208ee 100644 --- a/doc/twitter-bootstrap.md +++ b/doc/twitter-bootstrap.md @@ -1,10 +1,8 @@ -Bootstrap -========= +# Bootstrap We use Twitter's Open Source Bootstrap CSS framework to ensure our content displays correctly on mobiles. -Generation ----------- +## Generation See `contrib/bootstrap` for more information on this.
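Note on doc/testing.md: steps 1 to 3 of "Maintaining testdata reference DB" can be sketched as a small script. This is an illustration under stated assumptions, not an existing Debsources tool: `DRY_RUN`, `run` and `refresh_reference_db` are hypothetical names, `make -C testdata` stands in for the documented `cd testdata/` followed by `make`, and the script defaults to printing commands because step 1 is destructive. The commands themselves are the ones quoted in the procedure.

```shell
#!/bin/sh
# Sketch of steps 1-3 of the reference DB refresh; run from the Debsources
# top-level dir. DRY_RUN, run and refresh_reference_db are hypothetical names.

DRY_RUN=${DRY_RUN:-1}   # default: only print the commands, execute nothing

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"   # show what would be executed
    else
        "$@"                   # execute the command as given
    fi
}

refresh_reference_db() {
    # Step 1: clean slate, then re-initialize an empty DB.
    run dropdb debsources || return 1
    run rm -rf /srv/debsources/sources || return 1
    run createdb debsources || return 1
    run bin/debsources-dbadmin --createdb postgres:///debsources || return 1

    # Step 2: full update run with a trusted version of the code base.
    run bin/debsources-update -vv || return 1

    # Step 3: remove old DB dumps, then create fresh ones.
    run make -C testdata distclean || return 1
    run make -C testdata dump || return 1
}
```

Steps 4 and 5 (reviewing `git diff` in testdata/ and committing in both repositories) are deliberately left manual, since the procedure asks for a sanity check of the new dumps before anything is pushed.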