
Browser configuration

You have to configure your web browser to use WebCleaner as a proxy.

Netscape/Mozilla

Select Edit -> Preferences -> Advanced -> Proxies.
Activate Manual proxy configuration.
Under HTTP Proxy enter localhost, the Port is 8080.
Under HTTPS Proxy enter localhost, the Port is 8080.
Under No Proxy for enter localhost, 127.0.0.1.
Click Ok to use your new settings.

Internet Explorer

Select Tools -> Internet Options -> Connections.
Click on LAN Settings. If you have a dialup connection to the internet, select your dialup connection and click on Settings.
Activate Use a proxy server.
If activated, deactivate Bypass proxy server for local addresses.
Click on Advanced.
Under HTTP enter localhost, the Port is 8080.
Under Secure enter localhost, the Port is 8080.
Click Ok to use your new settings.

Opera 6

Select File -> Preferences -> Network -> Proxy servers.
Activate HTTP and enter localhost, the Port is 8080.
Activate HTTPS and enter localhost, the Port is 8080.
Activate Do not use Proxy on the addresses below and enter localhost, 127.0.0.1.
Click Ok to use your new settings.
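
The same proxy settings can also be used from a script, for example to check that requests really pass through WebCleaner. The following is a minimal sketch using Python's standard urllib module. It assumes the proxy listens on localhost port 8080 as in the browser instructions above; the helper name make_proxy_opener is made up for this example.

```python
import urllib.request

# Assumed from the browser instructions above: proxy on localhost, port 8080.
PROXY = "http://localhost:8080"

def make_proxy_opener(proxy=PROXY):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)

opener = make_proxy_opener()
# opener.open("http://example.com/") would now go through the proxy,
# provided WebCleaner is actually running on that port.
```

Building the opener does not contact the network; only the commented-out open() call would.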

Proxy filter modules

WebCleaner uses a modular filter design allowing a lot of flexibility for different uses.
Each module has a list of mime types and a list of the parts of the request/response exchange it applies to. Each module can be further customized by separate rules in the filter configuration.

BinaryCharFilter

Description

Replace illegal binary characters in HTML code like the quote chars often found in Microsoft pages.

Applies to

Response content data of text/html mime type.

Filter configuration rules

None

Blocker

Description

Block or allow specific sites by URL name. Before matching a URL, the hostname and path are unquoted to avoid spoofing attacks.
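
To illustrate why unquoting matters: a URL can hide a blocked word behind percent-escapes, so a pattern only matches reliably after decoding. The sketch below uses Python's urllib.parse. The function name normalize_for_matching and the example URLs are hypothetical, and this is not necessarily how WebCleaner itself normalizes URLs.

```python
import re
from urllib.parse import urlsplit, unquote

def normalize_for_matching(url):
    """Return 'host/path' with percent-escapes decoded and the host lowercased."""
    parts = urlsplit(url)
    host = unquote(parts.hostname or "").lower()
    path = unquote(parts.path)
    return host + path

# "%61" is an escaped "a", so the raw URL does not contain "/ads/".
raw = "http://example.com/%61ds/banner.gif"
pattern = re.compile(r"example\.com/ads/")

matches_raw = bool(pattern.search(raw))
matches_unquoted = bool(pattern.search(normalize_for_matching(raw)))
```

Here the pattern misses the raw URL but matches the unquoted form.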

Applies to

Request url of all mime types.

Filter configuration rules

Block, Allow

Compress

Description

Compression of documents with good compression ratio like HTML, WAV, etc.

Applies to

Response content data of mime types text/*, application/postscript, application/pdf, application/x-dvi, audio/basic, audio/midi, audio/x-wav, image/x-portable-*map, x-world/x-vrml.
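
The wildcard entries in this list can be read as shell-style patterns. As a sketch of how a mime type might be tested against the list, assuming glob semantics (the documentation does not say how WebCleaner matches internally; the names COMPRESS_MIME_PATTERNS and applies_to are invented for this example):

```python
from fnmatch import fnmatchcase

# The mime type list from the Compress module, written as glob patterns.
COMPRESS_MIME_PATTERNS = [
    "text/*",
    "application/postscript",
    "application/pdf",
    "application/x-dvi",
    "audio/basic",
    "audio/midi",
    "audio/x-wav",
    "image/x-portable-*map",
    "x-world/x-vrml",
]

def applies_to(content_type, patterns=COMPRESS_MIME_PATTERNS):
    """Return True if the Content-Type value matches one of the patterns."""
    # Drop parameters such as "; charset=utf-8" before matching.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    return any(fnmatchcase(mime_type, pattern) for pattern in patterns)
```

With these patterns, text/html and image/x-portable-pixmap match, while image/jpeg does not.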

Filter configuration rules

None

GifImage

Description

Deanimates GIFs and removes all unwanted GIF image extensions (for example GIF comments).

Applies to

Response content data of mime type image/gif.

Filter configuration rules

None

Header

Description

Add, modify and delete HTTP headers of request and response.

Applies to

Request and response headers of all mime types.

Filter configuration rules

Header

ImageReducer

Description

Convert images to low quality JPEG files to reduce bandwidth.

Applies to

Response content data of all image types supported by the Python Imaging Library: jpeg, png, gif, bmp, pcx, tiff, xbm, xpm.

Filter configuration rules

None

ImageSize

Description

Remove images with certain width and/or height.

Applies to

Response content data of all image types supported by the Python Imaging Library: jpeg, png, gif, bmp, pcx, tiff, xbm, xpm.

Filter configuration rules

Image

RatingHeader

Description

Parse and evaluate content rating system header values.

Applies to

Response headers of all mime types.

Filter configuration rules

Rating

Replacer

Description

Replace regular expressions in data streams.

Applies to

Response data of html and javascript

Filter configuration rules

Replace

HtmlRewriter

Description

Parse HTML code and rewrite single tags, attributes and values. Execute and filter JavaScript. Parse and filter content rated pages. Filter HTML comments.

Applies to

Response data of html pages

Filter configuration rules

Javascript, Nocomments, Rating, Htmlrewrite

XmlRewriter

Description

Parse XML code and rewrite single tags, attributes and values. It can also filter embedded HTML content, which often occurs in RSS feeds.

Applies to

Response data of html pages

Filter configuration rules

Htmlrewrite, Xmlrewrite

VirusFilter

Description

Scan all data with the ClamAv virus scanner (which must be installed on the proxy host). For performance reasons there is a maximum size of 4 MB. If an object exceeds that size, the proxy returns an error.

Applies to

Response data of html pages

Filter configuration rules

Antivirus

Filter configuration rules

Htmlrewrite

Matching

An HTML rewrite rule applies to one specified HTML tag and can replace (or delete if the replacement data is empty) parts of or the complete tag. The tag name is a case-insensitive string.
If attributes are given, they must match too before the rule applies.

Action

If there is no replacement given the specified tag part will be removed, else it will be replaced.
Back references to matched subgroups can be specified in the replacement string with a backslash and the subgroup number (i.e. \1, \2, ...).

What it does when the replacement is foo:

replace part   before                  after
tag            <blink>text</blink>     footextfoo
tagname        <blink>text</blink>     <foo>text</foo>
enclosed       <blink>text</blink>     <blink>foo</blink>
attr           <a href="bla">..</a>    <a foo>..</a>
attrval        <a href="bla">..</a>    <a href="foo">..</a>
complete       <a href="bla">..</a>    foo

If you specified no attribute or more than one attribute to match, 'attr' and 'attrval' replace the first occurring or first matching attribute, or nothing if no such attribute exists.
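
The back reference syntax is the same as in Python regular expressions. The sketch below imitates an 'attrval' replacement on a single attribute value. It assumes that the whole value is replaced when the pattern matches; the function name rewrite_attrval and the tracker URL are invented for the example, and this is not WebCleaner's own code.

```python
import re

def rewrite_attrval(value, pattern, replacement):
    """Return the new attribute value, or the old one if the rule does not match."""
    match = re.search(pattern, value)
    if match is None:
        return value  # the rule does not apply to this attribute
    # match.expand() substitutes \1, \2, ... with the matched subgroups.
    return match.expand(replacement)

# Hypothetical rule: strip a redirect wrapper from link targets.
old = "http://tracker.example/redirect?to=http://example.org/page"
new = rewrite_attrval(old, r"redirect\?to=(.*)", r"\1")
```

After the call, new holds only the part captured by the first subgroup.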

Xmlrewrite

Selector

An XML rewrite rule applies to one specific XML tag and can replace (or delete) parts of or the complete tag. The selector is a simplified XPath expression of the form (/tag)+ where a tag is of the form name([attr=val(,attr=val)*])?. Tag names, attributes and values are case sensitive.
Example: /rss/channel/item/description selects the <description> XML tag in an RSS news feed.
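
To make the selector grammar concrete, here is a small parser sketch for the form described above. It is an illustration only, not WebCleaner's parser: the name parse_selector is invented, and attribute values containing '/', ',' or ']' are not handled.

```python
import re

# One path step: a tag name, optionally followed by [attr=val,attr=val,...].
TAG_RE = re.compile(r"^([^/\[\]]+)(?:\[([^\]]*)\])?$")

def parse_selector(selector):
    """Parse '/a/b[x=1]' into a list of (tag name, attribute dict) steps."""
    if not selector.startswith("/"):
        raise ValueError("selector must start with '/'")
    steps = []
    for part in selector[1:].split("/"):
        match = TAG_RE.match(part)
        if match is None:
            raise ValueError("invalid tag in selector: %r" % part)
        name, attr_text = match.groups()
        attrs = {}
        if attr_text is not None:
            for pair in attr_text.split(","):
                key, sep, val = pair.partition("=")
                if not key or not sep:
                    raise ValueError("invalid attribute: %r" % pair)
                attrs[key] = val  # names and values stay case sensitive
        steps.append((name, attrs))
    return steps

steps = parse_selector("/rss/channel/item/description")
```

Each step keeps its original case, in line with the case-sensitive matching described above.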

Action

Defined replacement types:

replace type   replace value   action
rsshtml        unused          Assumes all text content inside the XML tag is HTML. Only allows certain HTML tags, and filters the HTML data with the Htmlrewrite rules.
remove         unused          Removes the complete selected XML tag and its content.

Replace

Replace regular expressions in HTML or JavaScript pages.

Block

A block rule specifies regular expressions for URLs which must be blocked.
The replacement URL specifies the URL to show when the block matches. If none is given, a default block message is shown.
Back references to matched subgroups can be specified in the replacement URL with a backslash and the subgroup number (i.e. \1, \2, ...).

Allow

An allow rule specifies regular expressions for URLs which must be allowed, even if a matching block rule exists.
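
The interplay of Block and Allow rules can be sketched as follows. The rule lists, host names and the DEFAULT_BLOCK_MESSAGE placeholder are made up for the example; only the precedence (allow wins over block) and the back reference expansion are taken from the descriptions above.

```python
import re

# (pattern, replacement URL); None means "show the default block message".
BLOCK_RULES = [
    (re.compile(r"^http://ads\.example\.com/(.*)$"),
     r"http://localhost/blocked?what=\1"),
    (re.compile(r"/banners?/"), None),
]
ALLOW_RULES = [
    re.compile(r"^http://ads\.example\.com/license\.html$"),
]
DEFAULT_BLOCK_MESSAGE = "blocked"  # stand-in for the real default message

def check_url(url):
    """Return None if the URL is allowed, else what to show instead."""
    # An allow rule overrides any matching block rule.
    if any(rule.search(url) for rule in ALLOW_RULES):
        return None
    for pattern, replacement in BLOCK_RULES:
        match = pattern.search(url)
        if match:
            if replacement is None:
                return DEFAULT_BLOCK_MESSAGE
            # Expand \1, \2, ... in the replacement URL.
            return match.expand(replacement)
    return None
```

For example, check_url("http://ads.example.com/popup.js") yields the replacement URL with popup.js filled in for \1, while the license page is let through despite matching the first block pattern.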

Header

Modify HTTP headers. If the replacement value is empty, the header is deleted, else it gets replaced or added if it did not exist before.
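
The delete/replace/add behaviour can be expressed in a few lines. This sketch works on a plain dictionary and treats header names as case insensitive; the function name apply_header_rule is hypothetical, and repeated headers of the same name are not covered.

```python
def apply_header_rule(headers, name, value):
    """Return a copy of headers with one Header rule applied."""
    # Drop any existing header of that name (names are case insensitive).
    result = {k: v for k, v in headers.items() if k.lower() != name.lower()}
    if value:
        # Non-empty replacement: replace the header, or add it if missing.
        result[name] = value
    return result

request = {"User-Agent": "Mozilla/5.0", "Referer": "http://example.org/"}
without_referer = apply_header_rule(request, "referer", "")
with_agent = apply_header_rule(request, "User-Agent", "WebCleaner")
with_extra = apply_header_rule(request, "X-Filtered", "yes")
```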

Image

Block images with a certain size by replacing them with a transparent 1x1 image.

Javascript

Execute and filter JavaScript (JS) in HTML pages using the integrated SpiderMonkey JS engine. The filter deletes popups and places dynamic content emitted with document.write() into the HTML file.

Nocomments

Remove comments from HTML source. Comments inside <script> or <style> tags are not removed.
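
As a rough illustration of this rule, the sketch below strips comments with a regular expression while leaving <script> and <style> elements untouched. WebCleaner parses the HTML instead of using a regular expression, so this simplification (with the invented name strip_comments) will not cope with malformed markup.

```python
import re

# Either a whole <script>/<style> element (group 1) or an HTML comment.
TOKEN_RE = re.compile(
    r"(<(script|style)\b[^>]*>.*?</\2\s*>)|<!--.*?-->",
    re.IGNORECASE | re.DOTALL,
)

def strip_comments(html):
    """Remove HTML comments, except those inside <script> or <style>."""
    # Keep script/style elements as they are; replace comments with nothing.
    return TOKEN_RE.sub(lambda m: m.group(1) or "", html)

page = ("<p>a<!-- gone --></p>"
        "<script><!-- kept --></script>"
        "<style>/* x */<!-- kept too --></style>")
cleaned = strip_comments(page)
```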

Rating

One activated Rating rule enables the content rating system in WebCleaner. Several distinct content rating services including the one defined by WebCleaner itself can be configured.
