volpino
a18083bd98
euscan: json format output
The "-f json" output now looks good: it reports the handler type used to
retrieve each version and includes metadata.
Signed-off-by: volpino <fox91@anche.no>
2012-05-23 16:30:43 +02:00
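A minimal sketch of what such "-f json" output could look like, assuming each result carries a version, a URL and the handler that found it. The function name `format_json` and all field names are hypothetical illustrations, not euscan's actual schema.

```python
import json


def format_json(results, metadata):
    """Serialize scan results and metadata to a JSON string.

    `results` is a list of dicts, `metadata` is a dict; both layouts
    are assumptions made for this example only.
    """
    return json.dumps(
        {"result": results, "metadata": metadata},
        indent=2,
        sort_keys=True,
    )


# Hypothetical sample data.
sample = format_json(
    [{"version": "1.2.3", "url": "http://example.org/foo-1.2.3.tar.gz",
      "handler": "generic"}],
    {"package": "app-misc/foo"},
)
```

Keeping the handler name next to each version lets a consumer see which retrieval method produced a given result.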
volpino
8cb19b5a6b
euscan: adding json output
Naive json output implemented, probably needs some further tuning
Signed-off-by: volpino <fox91@anche.no>
2012-05-21 22:38:38 +02:00
volpino
8c91855a58
Lovely day for PEP8 and pylint!
2012-04-28 18:16:05 +02:00
Corentin Chary
6a57b44d7c
euscan: force nodejs.org scan
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2012-04-04 14:34:09 +02:00
Corentin Chary
5bd358968a
euscan: re-indent blacklists
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2012-02-20 08:20:54 +01:00
Corentin Chary
277fb4ebe6
euscan: add new robots.txt exceptions
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2011-10-08 08:33:03 +02:00
Corentin Chary
d7f655cdde
euscan: add an optional persistent cache
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2011-10-02 10:04:44 +02:00
Corentin Chary
14971584af
euscan: robots.txt, timeout, user-agent, ...
- Add a blacklist for robots.txt, we *want* to scan sourceforge
- Set a user-agent that doesn't look like a browser
- Handle timeouts more carefully
- If brute force detects too many versions, avoid infinite loops
- Handle redirections more carefully
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2011-09-21 10:09:50 +02:00
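The robots.txt blacklist described in the commit above can be sketched roughly as follows, using the standard library's `urllib.robotparser`. The names `ROBOTS_TXT_BLACKLIST`, `USER_AGENT`, `skip_robots_txt` and `can_fetch` are hypothetical, as are their values; this is an illustration of the idea, not euscan's actual code.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical values, for illustration only.
ROBOTS_TXT_BLACKLIST = {"sourceforge.net"}
USER_AGENT = "example-scanner/0.1"  # deliberately not browser-like


def skip_robots_txt(url):
    """Return True if robots.txt should be ignored for this URL's host."""
    host = urlparse(url).hostname or ""
    return any(
        host == domain or host.endswith("." + domain)
        for domain in ROBOTS_TXT_BLACKLIST
    )


def can_fetch(url, robots_lines):
    """Decide whether url may be scanned, given the lines of its robots.txt."""
    if skip_robots_txt(url):
        return True
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(USER_AGENT, url)
```

The robots.txt content is passed in as lines so the decision logic can be exercised without network access; a real scanner would fetch it with a timeout first.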
Corentin Chary
8c40a1795c
euscan: blacklist art.gnome.org
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2011-09-10 08:25:39 +02:00
Corentin Chary
a137ef60e3
euscan: respect robots.txt
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2011-09-06 16:32:29 +02:00
Corentin Chary
752fb04425
euscan: shake the code
- add custom site handlers
- use a custom user agent
- fix some bugs in management commands
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
2011-08-31 15:38:32 +02:00