Alfred Wingate 22915bade5
Fix kde handler
* It appears it was broken in the midst of 8d912379, no apparent
  rationale for it being changed there.

Signed-off-by: Alfred Wingate <parona@protonmail.com>
2023-11-16 07:07:11 +02:00
bin Enable flake8-bugbear linting and fix raised issues 2023-11-16 06:25:40 +02:00
src/euscan Fix kde handler 2023-11-16 07:07:11 +02:00
.git-blame-ignore-revs .git-blame-ignore-revs: Ignore mass formatting commits 2023-11-16 00:16:19 +02:00
.gitignore Get version from git with setuptools_scm 2023-11-15 23:30:39 +02:00
.gitlab-ci.yml Add publishing capability to CI 2023-11-16 00:16:19 +02:00
.pre-commit-config.yaml Add pre-commit config 2023-11-15 23:30:40 +02:00
AUTHORS Remove the djano webservice in its entirety 2023-11-15 23:30:06 +02:00
CHANGELOG.rst Update changelog (at least the 1.0.1 changes) 2023-11-16 01:41:53 +02:00
LICENSE Use a cleaned up copy of the license 2023-11-16 00:16:19 +02:00
MANIFEST.in Update MANIFEST.in 2023-11-16 04:39:25 +02:00
README.rst Remove the djano webservice in its entirety 2023-11-15 23:30:06 +02:00
TODO Remove euscanwww leftovers from TODO 2023-11-16 04:51:30 +02:00
pyproject.toml Enable flake8-bugbear linting and fix raised issues 2023-11-16 06:25:40 +02:00


What is euscan-ng?


euscan-ng is a fork of Bernard Cafarelli's euscan: https://github.com/voyageur/euscan
which is a fork of Corentin Chary's euscan: https://github.com/iksaif/euscan

Right now euscan-ng and (legacy) euscan cannot be installed (system-wide) on the same system.
euscan-ng is available in the src_prepare overlay as a dev package (app-portage/euscan-ng-9999).

This tool checks whether a given package has new upstream versions.
It uses several heuristics to scan upstream and collect new versions and their URLs.

This tool was designed to mimic Debian's uscan, but there is a major
difference between the two: uscan uses a specific "watch" file that describes
how it should scan packages, while euscan-ng uses only what can already be found
in ebuilds. Of course, we could later add some information to metadata.xml
to help euscan-ng do its job more efficiently.

euscan-ng heuristics are described in the "How does it work?" section.


    $ euscan amatch

     * dev-ruby/amatch-0.2.7 [gentoo]

    Ebuild: /home/euscan/local/usr/portage/dev-ruby/amatch/amatch-0.2.7.ebuild
    Repository: gentoo
    Homepage: http://flori.github.com/amatch/
    Description: Approximate Matching Extension for Ruby

     * SRC_URI is 'mirror://rubygems/amatch-0.2.7.gem'
     * Using: http://rubygems.org/api/v1/versions/amatch.json

    Upstream Version: 0.2.8 http://rubygems.org/gems/amatch-0.2.8.gem


    $ euscan rsyslog

     * app-admin/rsyslog-5.8.5 [gentoo]

    Ebuild: /home/euscan/local/usr/portage/app-admin/rsyslog/rsyslog-5.8.5.ebuild
    Repository: gentoo
    Homepage: http://www.rsyslog.com/
    Description: An enhanced multi-threaded syslogd with database support and more.

     * SRC_URI is 'http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.8.5.tar.gz'
     * Scanning: http://www.rsyslog.com/files/download/rsyslog/rsyslog-${PV}.tar.gz
     * Scanning: http://www.rsyslog.com/files/download/rsyslog
     * Generating version from 5.8.5
     * Brute forcing: http://www.rsyslog.com/files/download/rsyslog/rsyslog-${PV}.tar.gz
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.8.6.tar.gz ...        [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.8.7.tar.gz ...        [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.8.8.tar.gz ...        [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.0.tar.gz ...        [ ok ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.10.0.tar.gz ...         [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.11.0.tar.gz ...         [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.1.tar.gz ...        [ ok ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.2.tar.gz ...        [ ok ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.3.tar.gz ...        [ ok ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.12.0.tar.gz ...         [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.4.tar.gz ...        [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.5.tar.gz ...        [ !! ]
     * Trying: http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.6.tar.gz ...        [ !! ]

    Upstream Version: 5.9.1 http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.1.tar.gz
    Upstream Version: 5.9.0 http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.0.tar.gz
    Upstream Version: 5.9.3 http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.3.tar.gz
    Upstream Version: 5.9.2 http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.9.2.tar.gz

Hidden settings

You can configure some settings using the command line, but the __init__.py
file of the euscan package contains more settings, including blacklists and
default settings.

Maybe we should add the ability to use /etc/euscan.conf and
~/.config/euscan/euscan.conf to override these settings.

How does it work?

euscan has different heuristics to scan upstream and provides multiple
"handlers". First, here is a description of the generic handler.

Scanning directories

The first thing to do is to scan directories. This is also what uscan does, but
uscan relies on a file that describes which URL and regexp to use to match packages.

euscan uses SRC_URI and tries to find the current version (or part of this version)
in the resolved SRC_URI, then generates a regexp from that.

For example for app-accessibility/dash-4.10.1, SRC_URI is::


euscan will scan pages based on this template::


From that template, it scans the top-most directory that doesn't depend on
the version and tries to go deeper from there.
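
The template step can be illustrated with a minimal sketch. This is not euscan's actual code: the function names ``make_template``, ``static_base`` and ``template_to_regex`` are hypothetical, the version is assumed to appear verbatim in the URL, and only dotted numeric versions are matched (the "part of this version" case mentioned above is not handled).

```python
import re

PLACEHOLDER = "${PV}"


def make_template(src_uri: str, version: str) -> str:
    """Replace every literal occurrence of the version with ${PV}."""
    return src_uri.replace(version, PLACEHOLDER)


def static_base(template: str) -> str:
    """Return the deepest directory prefix that does not contain ${PV}."""
    head = template.split(PLACEHOLDER, 1)[0]
    return head[: head.rfind("/") + 1]


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Build a regex where each ${PV} becomes a capture group.

    Assumption: versions are dotted integers such as 5.9.1.
    """
    literal_parts = [re.escape(part) for part in template.split(PLACEHOLDER)]
    return re.compile(r"(\d+(?:\.\d+)*)".join(literal_parts))


uri = "http://www.rsyslog.com/files/download/rsyslog/rsyslog-5.8.5.tar.gz"
template = make_template(uri, "5.8.5")
base = static_base(template)
pattern = template_to_regex(template)
```

With the rsyslog SRC_URI from the example output earlier, ``base`` is the directory to list, and ``pattern`` can be matched against each link found there to extract candidate versions.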

Brute force

As with directory scanning, a template of SRC_URI is built. euscan then
generates the next possible version numbers and tries to download the URL built
from the template and each new version number.

For example, running euscan on portage/app-accessibility/festival-freebsoft-utils-0.6::

  SRC_URI is 'http://www.freebsoft.org/pub/projects/festival-freebsoft-utils/festival-freebsoft-utils-0.6.tar.gz'
  Template is http://www.freebsoft.org/pub/projects/festival-freebsoft-utils/festival-freebsoft-utils-${PV}.tar.gz
  Generate version from 0.6: 0.7, 0.8, 0.10, ...
  Try new urls: http://www.freebsoft.org/pub/projects/festival-freebsoft-utils/festival-freebsoft-utils-0.7.tar.gz, etc.
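
A rough sketch of how candidate versions could be generated is shown below. It is an illustration only, not euscan's implementation: ``candidate_versions``, ``candidate_urls`` and the ``per_level`` parameter are hypothetical names, and versions are assumed to be dotted integers. Each component is bumped in turn, starting from the last one, and the components to its right are reset to zero.

```python
from typing import List


def candidate_versions(version: str, per_level: int = 3) -> List[str]:
    """Generate guesses by bumping each version component in turn."""
    parts = [int(p) for p in version.split(".")]
    candidates = []
    # Walk from the least significant component to the most significant one.
    for i in range(len(parts) - 1, -1, -1):
        for step in range(1, per_level + 1):
            bumped = parts[:i] + [parts[i] + step] + [0] * (len(parts) - i - 1)
            candidates.append(".".join(str(p) for p in bumped))
    return candidates


def candidate_urls(template: str, version: str, per_level: int = 3) -> List[str]:
    """Fill the ${PV} template with each candidate version."""
    return [
        template.replace("${PV}", candidate)
        for candidate in candidate_versions(version, per_level)
    ]
```

A real scan would then request each URL and keep the ones that exist; that network step is left out here so the sketch stays self-contained.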


euscan uses blacklists for several purposes.

  For versions that should not be checked at all. sys-libs/libstdc++-v3-3.4
  is a good example because it is a package whose version will always be 3.4
  (a compatibility package for running binaries linked against a pre gcc 3.4 libstdc++).

  Some packages are dead, but their SRC_URI refers to sources that are still being
  updated. For example, sys-kernel/xbox-sources uses the same sources as
  vanilla-sources but is not updated the same way.

  For URLs that are not browsable. mirror://gentoo/ is a good example: scanning
  it is both pointless and very slow/expensive.

  Disable brute force on those packages and URLs. Most of the time it's because
  upstream is broken and will answer HTTP 200 even if the file doesn't exist.

  Don't respect robots.txt for these domains (sourceforge, berlios, github.com).
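
As an illustration of how such lists might be consulted, here is a small sketch. The variable and function names are hypothetical and the entries are examples taken from the descriptions above; the real lists live in the package's __init__.py and may be structured differently.

```python
import re
from urllib.parse import urlparse

# Hypothetical example entries, not euscan's actual settings.
UNBROWSABLE_URL_PATTERNS = [r"^mirror://gentoo/"]
IGNORE_ROBOTS_DOMAINS = {"github.com"}


def is_scan_blacklisted(url: str) -> bool:
    """True if the URL matches a pattern that should never be scanned."""
    return any(re.search(pattern, url) for pattern in UNBROWSABLE_URL_PATTERNS)


def should_respect_robots(url: str) -> bool:
    """False for hosts (and their subdomains) listed in IGNORE_ROBOTS_DOMAINS."""
    host = urlparse(url).hostname or ""
    return not any(
        host == domain or host.endswith("." + domain)
        for domain in IGNORE_ROBOTS_DOMAINS
    )
```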

Site handlers

  A site handler that uses the PECL/PEAR REST API.

  A site handler that uses the RubyGems JSON API.

  A site handler that uses PyPI's XML-RPC API.
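
To give an idea of what a site handler does with an API response, the sketch below parses a RubyGems-style versions payload (a JSON array of objects with a ``number`` field) and keeps the versions newer than the current one. The function names are hypothetical, the sample payload is made up and heavily trimmed, and the download URL pattern is copied from the amatch example output earlier; none of this is euscan's actual handler code.

```python
import json
from typing import List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '0.2.8' into (0, 2, 8) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newer_gem_versions(payload: str, gem: str, current: str) -> List[Tuple[str, str]]:
    """Return (version, url) pairs for versions newer than `current`."""
    found = []
    for entry in json.loads(payload):
        number = entry["number"]
        try:
            is_newer = version_key(number) > version_key(current)
        except ValueError:
            # Skip versions that are not purely dotted integers.
            continue
        if is_newer:
            found.append((number, f"http://rubygems.org/gems/{gem}-{number}.gem"))
    return found


# Made-up, trimmed sample of what the versions endpoint might return.
sample = '[{"number": "0.2.8"}, {"number": "0.2.7"}, {"number": "0.2.5"}]'
result = newer_gem_versions(sample, "amatch", "0.2.7")
```

A real handler would first fetch the payload from the URL shown after "Using:" in the amatch example; the fetch is omitted so the sketch runs without network access.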