Maciej Barć 962f26ae05
update setup files
1 year ago


What is euscan-ng?


euscan-ng is a fork of Bernard Cafarelli's euscan:
which is a fork of Corentin Chary's euscan:

Right now euscan-ng and (legacy) euscan cannot be installed (system-wide) on the same system.
euscan-ng is available in src_prepare overlay as a dev package (app-portage/euscan-ng-9999).

This tool checks whether a given package has new upstream versions.
It uses several heuristics to scan upstream and collect new versions and their URLs.

This tool was designed to mimic Debian's uscan, but there is a major
difference between the two: uscan uses a specific "watch" file that describes
how it should scan packages, while euscan-ng uses only what can already be found
in ebuilds. Of course, we could later add some information to metadata.xml
to help euscan-ng do its job more efficiently.

euscan-ng heuristics are described in the "How does it work?" section.


$ euscan amatch

* dev-ruby/amatch-0.2.7 [gentoo]

Ebuild: /home/euscan/local/usr/portage/dev-ruby/amatch/amatch-0.2.7.ebuild
Repository: gentoo
Description: Approximate Matching Extension for Ruby

* SRC_URI is 'mirror://rubygems/amatch-0.2.7.gem'
* Using:

Upstream Version: 0.2.8


$ euscan rsyslog

* app-admin/rsyslog-5.8.5 [gentoo]

Ebuild: /home/euscan/local/usr/portage/app-admin/rsyslog/rsyslog-5.8.5.ebuild
Repository: gentoo
Description: An enhanced multi-threaded syslogd with database support and more.

* SRC_URI is ''
* Scanning:${PV}.tar.gz
* Scanning:
* Generating version from 5.8.5
* Brute forcing:${PV}.tar.gz
* Trying: ... [ !! ]
* Trying: ... [ !! ]
* Trying: ... [ !! ]
* Trying: ... [ ok ]
* Trying: ... [ !! ]
* Trying: ... [ !! ]
* Trying: ... [ ok ]
* Trying: ... [ ok ]
* Trying: ... [ ok ]
* Trying: ... [ !! ]
* Trying: ... [ !! ]
* Trying: ... [ !! ]
* Trying: ... [ !! ]

Upstream Version: 5.9.1
Upstream Version: 5.9.0
Upstream Version: 5.9.3
Upstream Version: 5.9.2

Hidden settings

You can configure some settings using the command line, but the
file of the euscan package contains more settings, including blacklists and
default settings.

Maybe we should add the ability to use /etc/euscan.conf and
~/.config/euscan/euscan.conf to override these settings.
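
As a rough idea of what such an override could look like, here is a minimal
sketch using Python's standard configparser. Nothing in it exists in euscan
today: the ``[euscan]`` section name, the keys in ``DEFAULTS`` and the
``load_settings`` helper are all assumptions made for illustration. Later files
win over earlier ones, so the per-user file overrides the system-wide one.

```python
import configparser
import os

# Hypothetical defaults; the real settings live in the euscan package.
DEFAULTS = {"brute-force": "2", "oneshot": "yes"}

# Files are read in this order, so the user file overrides the system file.
CONFIG_PATHS = [
    "/etc/euscan.conf",
    os.path.expanduser("~/.config/euscan/euscan.conf"),
]


def load_settings(paths=CONFIG_PATHS):
    """Return the defaults overridden by any config files that exist."""
    parser = configparser.ConfigParser()
    parser.read_dict({"euscan": DEFAULTS})
    # ConfigParser.read() silently skips files that are missing.
    parser.read(paths)
    return dict(parser["euscan"])


if __name__ == "__main__":
    for key, value in sorted(load_settings().items()):
        print(f"{key} = {value}")
```

With this layout, a user file containing only ``[euscan]`` and
``brute-force = 5`` would change that single value and leave every other
default alone.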

euscan-www: euscan as a service

euscan-www is a web application that aggregates euscan results. For example
there is an instance of euscan-www that monitors gentoo-x86 + some official
overlays currently hosted at .

euscan-www uses Django and provides some custom commands to feed the database.
You can use euscan-www on your system tree, or preferably you can use a local
tree to avoid messing with your system.


Install requirements from PyPI using::

$ python develop

Extra dependencies:
* portage python api
* rrdtool[python]

Like any django web app, just start by editing and then run
these two commands.

$ python syncdb
$ python migrate

Your instance is now ready; run this command to browse it.
If you want to host it publicly, you should use a real web server.

$ python runserver

Creating a local tree (optional)

Create a local tree with everything that portage (and layman) would need.
There is an example in euscanwww/scripts/local-tree/. See
to know what env variables you need to run any portage related command in
this local tree.

Scanning process

The scanning process is done by You should read carefully
this script, and adapt it to your needs. For example, it uses gparallel to
launch multiple processes at a time, and you should adapt that to your number
of CPUs and your network bandwidth.

Once your is ok, just run it.

$ sh

Custom Django management commands

euscan-www provides some new management commands. Here is a short description
of these commands. Use "help" or read to get more information.

List packages stored in database.

Scan the portage tree and store new packages and versions in the database.

Scan metadata and look for homepages, maintainers and herds.

Scan upstream packages. The preferred way to use this script is to first run
euscan on some packages, store the result in a file, and feed this command with
that file.

Update statistics and rrd files.

If you deleted your rrd files, this script will use the database to
regenerate them.

How does it work?

euscan has different heuristics to scan upstream and provides multiple
"handlers". First, here is a description of the generic handler.

Scanning directories

The first thing to do is to scan directories. This is also what uscan does, but it
uses a file that describes which URL and regexp to use to match packages.

euscan uses SRC_URI and tries to find the current version (or part of this version)
in the resolved SRC_URI, then generates a regexp from that.

For example, for app-accessibility/dasher-4.10.1, SRC_URI is::


euscan will scan pages based on this template::${0}.${1}/dasher-${PV}.tar.bz2

Then, from that, it will scan the top-most directory that doesn't depend on
the version, and try to go deeper from there.
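
The template and regexp steps described above can be sketched roughly as
follows. This is not euscan's actual code: ``build_template`` and
``template_to_regex`` are hypothetical helpers, the ``example.org`` URL is a
stand-in, and the sketch only handles purely numeric, dot-separated versions.

```python
import re

# A dotted numeric version such as 4.10.1 (assumption: no suffixes like _rc1).
VERSION_RE = r"\d+(?:\.\d+)*"


def build_template(url, version):
    """Replace the full version with ${PV} and a partial one with ${0}.${1}..."""
    template = url.replace(version, "${PV}")
    parts = version.split(".")
    # Try the longest version prefix first: 4.10.1 -> 4.10 -> 4
    for size in range(len(parts) - 1, 0, -1):
        prefix = ".".join(parts[:size])
        placeholder = ".".join("${%d}" % i for i in range(size))
        # Only match the prefix when it stands alone, not inside another token.
        bounded = r"(?<![\w.])" + re.escape(prefix) + r"(?![\w.])"
        new_template, count = re.subn(bounded, lambda m: placeholder, template)
        if count:
            return new_template
    return template


def template_to_regex(template):
    """Turn a template into a regexp whose first group captures the version."""
    pattern = re.escape(template)
    pattern = pattern.replace(re.escape("${PV}"), "(" + VERSION_RE + ")")
    # Each escaped ${N} placeholder stands for one numeric version component.
    pattern = re.sub(r"\\\$\\\{\d+\\\}", r"\\d+", pattern)
    return re.compile(pattern)


if __name__ == "__main__":
    url = "http://example.org/sources/dasher/4.10/dasher-4.10.1.tar.bz2"
    template = build_template(url, "4.10.1")
    print(template)
    print(template_to_regex(template).pattern)
```

The resulting regexp can then be matched against the links found on a
directory listing page; any match yields a candidate upstream version in its
first group.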

Brute force

As when scanning directories, a template of SRC_URI is built. Then euscan
generates the next possible version numbers and tries to download the URL built
from the template and each new version number.

For example, running euscan on portage/app-accessibility/festival-freebsoft-utils-0.6::

SRC_URI is ''
Template is${PV}.tar.gz
Generate version from 0.6: 0.7, 0.8, 0.10, ...
Try new urls:, etc..
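
The brute force steps above could look roughly like this in Python. It is only
a sketch under assumptions: the helper names, the ``limit`` parameter, the
order in which candidates are produced and the ``example.org`` template are
all made up for illustration, and euscan's real version generator may produce
a different set of candidates.

```python
import urllib.request


def candidate_versions(version, limit=3):
    """Bump each numeric component by 1..limit, zeroing the ones after it."""
    parts = [int(p) for p in version.split(".")]  # numeric versions only
    candidates = []
    for index in range(len(parts)):
        for step in range(1, limit + 1):
            tail = [0] * (len(parts) - index - 1)
            bumped = parts[:index] + [parts[index] + step] + tail
            candidates.append(".".join(str(p) for p in bumped))
    return candidates


def candidate_urls(template, version, limit=3):
    """Fill the ${PV} placeholder of the template with each candidate."""
    return [template.replace("${PV}", v)
            for v in candidate_versions(version, limit)]


def url_exists(url, timeout=10):
    """Return True if a HEAD request on the URL answers HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts are OSErrors
        return False


if __name__ == "__main__":
    template = "http://example.org/festival-freebsoft-utils-${PV}.tar.gz"
    for url in candidate_urls(template, "0.6"):
        print("Trying:", url, "[ ok ]" if url_exists(url) else "[ !! ]")
```

Each URL that answers successfully is reported as a possible new upstream
version. An upstream that answers HTTP 200 for files that do not exist would
defeat this check.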


euscan uses blacklists for multiple purposes.

For versions that should not be checked at all. sys-libs/libstdc++-v3-3.4
is a good example because it's a package whose version will always be 3.4
(a compatibility package for running binaries linked against a pre-gcc-3.4 libstdc++).

Some packages are dead, but SRC_URI refers to sources that are still being
updated, for example: sys-kernel/xbox-sources that uses the same sources as
vanilla-sources but is not updated the same way.

For URLs that are not browsable. mirror://gentoo/ is a good example: scanning
it is both pointless and very slow/expensive.

Disable brute force on those packages and urls. Most of the time it's because
upstream is broken and will answer HTTP 200 even if the file doesn't exist.

Don't respect robots.txt for these domains (sourceforge, berlios,

Site handlers

A site handler that uses the PECL/PEAR REST API.

This one uses RubyGems' JSON API.

Uses PyPI's XML-RPC API.