4 Commits

Author SHA1 Message Date
eedf3c5939 Remove euscanwww leftovers from TODO
Signed-off-by: Alfred Wingate <parona@protonmail.com>
2023-11-16 04:51:30 +02:00
7ac854dc61 Add changelog to python metadata
Signed-off-by: Alfred Wingate <parona@protonmail.com>
2023-11-16 04:43:44 +02:00
0551629a9a Update MANIFEST.in
* Missed in previous changes

Signed-off-by: Alfred Wingate <parona@protonmail.com>
2023-11-16 04:39:25 +02:00
17c4e19bc5 Filter XMLParsedAsHTMLWarnings
* Parsing xhtml sites would trigger it.

Signed-off-by: Alfred Wingate <parona@protonmail.com>
2023-11-16 04:27:57 +02:00
4 changed files with 7 additions and 45 deletions

@ -1,8 +1,8 @@
include AUTHORS
include CHANGELOG.rst
include LICENSE
include README.rst
include TODO
include setup.py
include pyproject.toml
recursive-include bin *
recursive-include man *
recursive-include pym *.py
recursive-include src *.py

TODO
@ -50,43 +50,6 @@ euscan
- Propose new remote-id: freecode
e.g.: <remote-id type="freecode">projectname</remote-id>
euscanwww
---------
### misc
- Really fix mails: better formating
- Always keep in db all found versions (when using an API only?). But don't display them if older than current packaged version, except maybe in the "upstream_version" column.
### packages
- Ignore alpha/beta if current is not alpha/beta: per-package setting using metadata.xml ?
- ~arch / stable support: see "models: keywords"
- stabilisation candidates: check stabilizations rules, and see how this can be automated
- set upstream version by hand: will be done after uscan compatiblity
### logs
- Move log models into djeuscanhistory ?
### models
- Repository (added or not, from layman + repositories.xml)
- Arches and Keyword
- Metadata, herds, maintainers and homepage are per-version, not per package. Store it in Version instead.
### djportage (LOW-PRIORITY))
- Create standalone application to scan and represent portage trees in models using work done in:
-- euscan
-- p.g.o: https://github.com/bacher09/gentoo-packages
-- gentoostats: https://github.com/gg7/gentoostats_server/blob/master/gentoostats/stats/models.py
The application should be easy to use, and we should be able to launch the scan process in a celery worker using "logging" for logs.
The application should also be usable for p.g.o and gentoostats later...
The scan process should be faster than the one using euscan. gentoo-packages have some interesting ideas for that (keeping metadata and ebuild hash, etc..)
### API (LOW-PRIORITY)
- Move to tastypie

@ -22,6 +22,7 @@ dynamic = ["version"]
[project.urls]
homepage = "https://gitlab.com/src_prepare/euscan-ng"
changelog = "https://gitlab.com/src_prepare/euscan-ng/-/blob/master/CHANGELOG.rst"
[tool.setuptools]
script-files = ["bin/euscan"]

@ -8,14 +8,11 @@ import re
import urllib.error
import urllib.parse
import urllib.request
import warnings
from urllib.parse import urljoin, urlparse
try:
from BeautifulSoup import BeautifulSoup
except ImportError:
from bs4 import BeautifulSoup
import portage
from bs4 import BeautifulSoup, XMLParsedAsHTMLWarning
from euscan import (
BRUTEFORCE_BLACKLIST_PACKAGES,
@ -65,6 +62,7 @@ def confidence_score(found, original, minimum=CONFIDENCE):
def scan_html(data, url, pattern):
warnings.filterwarnings("ignore", category=XMLParsedAsHTMLWarning)
soup = BeautifulSoup(data, features="lxml")
results = []
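
Note on the hunk above: calling warnings.filterwarnings inside scan_html adds an entry to the process-wide filter list, so the filter stays in effect after the function returns. If that side effect matters, a scoped variant with warnings.catch_warnings could look like the sketch below. It is only an illustration, not the code from this commit: StandInWarning, noisy_parse and scan_quietly are hypothetical names standing in for XMLParsedAsHTMLWarning, the BeautifulSoup call and scan_html, so that the sketch runs with the standard library alone.

```python
import warnings


class StandInWarning(UserWarning):
    """Hypothetical stand-in for bs4's XMLParsedAsHTMLWarning."""


def noisy_parse(data):
    # Hypothetical stand-in for BeautifulSoup(data, features="lxml"):
    # it warns when the input starts with an XML declaration.
    if data.lstrip().startswith("<?xml"):
        warnings.warn("XML document parsed as HTML", StandInWarning, stacklevel=2)
    return data.strip()


def scan_quietly(data):
    # catch_warnings() saves the current filter list and restores it on exit,
    # so the "ignore" rule only applies inside this block.
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=StandInWarning)
        return noisy_parse(data)
```

With this shape, a caller of scan_quietly sees no StandInWarning, while code elsewhere in the process that triggers the same warning is unaffected. The documentation for warnings.catch_warnings notes that it is not thread-safe, which may matter if scans run concurrently.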