Reference
from_html(html, root_url=None, include_fallbacks=False)
Extract all favicons from the given HTML.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
html | str | HTML to parse. | required |
root_url | Optional[str] | Root URL where the favicon is located. | None |
include_fallbacks | bool | Whether to include fallback favicons like | False |
Returns:
Type | Description |
---|---|
set[Favicon] | A set of favicons. |
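
To make the role of `root_url` concrete, here is a minimal standard-library sketch of the general technique: collect `<link>` tags whose `rel` mentions an icon and resolve relative `href` values against a root URL. It is not the library's implementation; `IconLinkParser` and `icon_urls` are hypothetical names, and returning plain URL strings rather than `Favicon` objects is a simplification.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect href values of <link> tags whose rel attribute mentions an icon."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        href = attributes.get("href")
        # Matches rel values such as "icon", "shortcut icon", "apple-touch-icon".
        if "icon" in rel and href:
            self.hrefs.append(href)


def icon_urls(html: str, root_url: Optional[str] = None) -> set[str]:
    """Return icon URLs found in `html`, made absolute when `root_url` is given."""
    parser = IconLinkParser()
    parser.feed(html)
    if root_url is None:
        return set(parser.hrefs)
    return {urljoin(root_url, href) for href in parser.hrefs}


page = '<html><head><link rel="icon" href="/static/icon.png"></head></html>'
print(icon_urls(page, root_url="https://example.com"))
# {'https://example.com/static/icon.png'}
```

Without `root_url`, a relative `href` such as `/static/icon.png` cannot be turned into a downloadable URL, which is why passing it matters when the HTML comes from a known site.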
from_url(url, include_fallbacks=False, client=None)
Extracts favicons from a given URL.
This function attempts to retrieve the specified URL, parse its HTML, and extract any
associated favicons. If the URL is reachable and returns a successful response, the
function parses the content for favicon references. If `include_fallbacks` is True,
it also attempts to find fallback icons (e.g., by checking default icon paths).
If the URL is not reachable or returns an error response, an empty set is returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
url | str | The URL from which to extract favicons. | required |
include_fallbacks | bool | Whether to include fallback favicons if none are explicitly defined. Defaults to False. | False |
client | Optional[Client] | A custom client instance from | None |
Returns:
Type | Description |
---|---|
set[Favicon] | A set of |
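
The control flow described for `from_url` (empty set on failure, fallback paths only on request) can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the library's code: `fetch_html`, `fallback_candidates`, and `DEFAULT_ICON_PATHS` are hypothetical names, `/favicon.ico` is assumed as the only default path, and the HTML parsing step is left out.

```python
import urllib.request
from typing import Optional
from urllib.parse import urljoin

# Assumption: "/favicon.ico" is the conventional default location; the paths the
# documented function actually checks are not listed in this reference.
DEFAULT_ICON_PATHS = ("/favicon.ico",)


def fetch_html(url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> Optional[str]:
    """Return the page body, or None if the URL is unreachable or returns an error."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError and HTTPError (4xx/5xx) are OSError subclasses;
        # ValueError covers malformed URLs.
        return None


def fallback_candidates(
    url: str,
    include_fallbacks: bool = False,
    opener=urllib.request.urlopen,
) -> set[str]:
    """Mirror the documented contract: empty set on failure, fallbacks only on request."""
    html = fetch_html(url, opener=opener)
    if html is None:
        return set()
    found: set[str] = set()  # icons parsed from `html` would be added here
    if include_fallbacks:
        found.update(urljoin(url, path) for path in DEFAULT_ICON_PATHS)
    return found
```

The `opener` parameter exists only so the sketch can be exercised without network access; a caller would normally leave it at its default.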
download(favicons, mode='all', include_unknown=True, sleep_time=2, sort='ASC', client=None)
Download previously extracted favicons.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
favicons | Union[list[Favicon], set[Favicon]] | List or set of favicons to download. | required |
mode | str | Select the strategy to download favicons. - | 'all' |
include_unknown | bool | Whether to include images with no width/height information. | True |
sleep_time | int | Number of seconds to wait between requests to avoid being blocked. | 2 |
sort | str | Sort favicons by size in ASC or DESC order. Only used for mode | 'ASC' |
client | Optional[Client] | A custom client instance from | None |
Returns:
Type | Description |
---|---|
list[RealFavicon] | A list of downloaded favicons. |
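
The interaction between `include_unknown` and `sort` can be illustrated with a small stand-alone sketch. It is not the library's code: `Icon` is a hypothetical stand-in for a favicon record, treating a width or height of 0 as "unknown" is an assumption, and the `mode` parameter is ignored because its accepted values are cut off in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Icon:
    """Hypothetical stand-in for a favicon record with optional dimensions."""

    url: str
    width: int = 0   # 0 means "unknown" in this sketch
    height: int = 0


def order_for_download(icons, include_unknown: bool = True, sort: str = "ASC") -> list:
    """Filter out icons of unknown size if requested, then sort by pixel area."""
    if sort not in ("ASC", "DESC"):
        raise ValueError("sort must be 'ASC' or 'DESC'")
    selected = [
        icon for icon in icons
        if include_unknown or (icon.width > 0 and icon.height > 0)
    ]
    return sorted(
        selected,
        key=lambda icon: icon.width * icon.height,
        reverse=(sort == "DESC"),
    )


icons = [Icon("a.png", 32, 32), Icon("b.ico"), Icon("c.png", 16, 16)]
print([icon.url for icon in order_for_download(icons, include_unknown=False, sort="DESC")])
# ['a.png', 'c.png']
```

With `include_unknown=True`, icons of unknown size sort as area 0 in this sketch, so they come first in ASC order and last in DESC order.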
guess_size(favicon, chunk_size=512)
Guess the size of an image by requesting only its first bytes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
favicon | Favicon | The favicon object whose size should be guessed. | required |
chunk_size | int | Size in bytes of each chunk read from the image stream. | 512 |
Returns:
Type | Description |
---|---|
Tuple[int, int] | The guessed width and height. |
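
Reading only the first bytes is enough because common image formats store their dimensions near the start of the file. The sketch below shows this for PNG, where the width and height sit at byte offsets 16 to 24 inside the IHDR chunk. It is an illustration of the idea, not the library's implementation: `png_size` is a hypothetical helper, other formats are not handled, and returning `(0, 0)` for unrecognized data is an assumption.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(first_bytes: bytes) -> Tuple[int, int]:
    """Return (width, height) from the start of a PNG file, or (0, 0) if unknown."""
    if len(first_bytes) < 24 or not first_bytes.startswith(PNG_SIGNATURE):
        return (0, 0)
    # Bytes 8-12 hold the chunk length, bytes 12-16 the chunk type.
    if first_bytes[12:16] != b"IHDR":
        return (0, 0)
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", first_bytes[16:24])
    return (width, height)


# Build a minimal PNG prefix: signature, IHDR length (13), type, width, height.
header = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 32, 16)
print(png_size(header))
# (32, 16)
```

A chunk of 512 bytes, the default `chunk_size`, comfortably covers the 24 bytes this PNG check needs.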
guess_missing_sizes(favicons, chunk_size=512, sleep_time=1, load_base64_img=False)
Attempts to determine missing dimensions (width and height) of favicons.
For each favicon in the provided collection, if the favicon is a base64-encoded
image (data URL) and `load_base64_img` is True, the function decodes and loads
the image to guess its dimensions. For non-base64 favicons with missing or zero
dimensions, the function attempts to guess the size by partially downloading the
icon data (using `guess_size`).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
favicons | Union[list[Favicon], set[Favicon]] | A list or set of | required |
chunk_size | int | The size of the data chunk to download for guessing dimensions of non-base64 images. Defaults to 512. | 512 |
sleep_time | int | The number of seconds to sleep between guessing attempts to avoid rate limits or overloading the server. Defaults to 1. | 1 |
load_base64_img | bool | Whether to decode and load base64-encoded images (data URLs) to determine their dimensions. Defaults to False. | False |
Returns:
Type | Description |
---|---|
list[Favicon] | A list of |
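
Base64-encoded favicons carry their image bytes inside the URL itself, so no request is needed to inspect them. The following sketch shows how such a data URL can be decoded with the standard library; it is illustrative only, and `decode_data_url` is a hypothetical helper rather than part of the documented API.

```python
import base64
from typing import Optional


def decode_data_url(url: str) -> Optional[bytes]:
    """Return the decoded payload of a base64 data URL, or None for any other URL."""
    if not url.startswith("data:"):
        return None
    # A data URL has the form "data:<media type>;base64,<payload>".
    header, separator, payload = url.partition(",")
    if not separator or not header.endswith(";base64"):
        return None
    return base64.b64decode(payload)


raw = b"example image bytes"
data_url = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
print(decode_data_url(data_url) == raw)
# True
print(decode_data_url("https://example.com/favicon.ico"))
# None
```

The decoded bytes can then be handed to any image reader to obtain the width and height, which is the step `load_base64_img` enables.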
check_availability(favicons, sleep_time=1, client=None)
Checks the availability and final URLs of a collection of favicons.
For each favicon in the provided list or set, this function sends a HEAD request
(or an optimized request if available) to check whether the favicon's URL is
reachable. If the favicon is reachable, its `reachable` attribute is set to
True. If the request results in a redirect, the favicon's URL is updated to the
final URL.
A delay (`sleep_time`) can be specified between checks to avoid rate limits
or overloading the server.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
favicons | Union[list[Favicon], set[Favicon]] | A collection of | required |
sleep_time | int | Number of seconds to sleep between each availability check to control request rate. Defaults to 1. | 1 |
client | Optional[Client] | A custom client instance from | None |
Returns:
Type | Description |
---|---|
list[Favicon] | A list of |
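
A single availability check of the kind described above can be sketched with `urllib.request`. This is not the library's implementation, which accepts its own client object: `head_check` is a hypothetical helper, and the `opener` parameter exists only so the sketch can be tested without network access.

```python
import urllib.request
from typing import Tuple


def head_check(url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> Tuple[bool, str]:
    """Return (reachable, final_url) for `url` using a HEAD request."""
    try:
        # Building the request inside the try block also catches malformed URLs.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            # urlopen follows redirects by default; geturl() reports the final URL.
            return True, response.geturl()
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses.
        return False, url
```

Calling `head_check("https://example.com/favicon.ico")` would return `True` and the final URL if the icon is reachable, or `False` and the original URL otherwise. Looping over a collection with `time.sleep(sleep_time)` between calls gives the rate limiting described above.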