sjehuda / Paper Clip

Version: 23.12.03+74cff74 updated

Summary: Save selection as clean HTML, Markdown or Text file optimized for printing. This program can also cut and edit text. Hotkey: Command + Shift + S to save as HTML.

Homepage: https://openuserjs.org/scripts/sjehuda/Paper_Clip

Support: https://openuserjs.org/scripts/sjehuda/Paper_Clip/issues

Copyright: 2023, Schimon Jehudah (http://schimon.i2p)

License: MIT; https://opensource.org/licenses/MIT

📎 Paper Clip

Save selected content to clean HTML, Markdown or Plain Text.

This userscript saves the common (root) HTML element of a given selection, together with all of its child elements, into a print-optimized (X)HTML, Markdown or Text file. It is useful when you want a compact, quick reference without extra resources and media.
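
As a rough illustration of what "common (root) element of a selection" means, the sketch below resolves it from a DOM Range. It is not taken from the script's source; the function name `commonRootElement` is hypothetical, and only the standard `Range.commonAncestorContainer`, `Node.nodeType` and `Node.parentElement` properties are relied on.

```javascript
// Sketch only: find the element that encloses an entire selection.
// `range` is expected to behave like a DOM Range, for example
// window.getSelection().getRangeAt(0) in a browser.
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in browsers

function commonRootElement(range) {
  const node = range.commonAncestorContainer;
  // When the selection lies inside a single text node, that text node is the
  // common ancestor, so step up to the element that contains it.
  return node.nodeType === ELEMENT_NODE ? node : node.parentElement;
}

// Browser usage (shown as comments because it needs a live page):
// const selection = window.getSelection();
// if (selection.rangeCount > 0) {
//   const root = commonRootElement(selection.getRangeAt(0));
//   const copy = root.cloneNode(true); // deep copy: the element and its children
// }
```

Working on a deep copy, as in the commented usage, leaves the live page untouched while the copy is cleaned and saved.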


Features

  • Cut text;
  • Edit text as you type;
  • Instant save (no waiting time);
  • Remove all stylesheets and potential distractions;
  • Optimized for annotation, notes and printing;
  • Easy to manipulate XHTML (HTML), Markdown or Text;
  • Strip attributes of tags;
  • Omit privacy-compromising content (frames, media, scripts etc.);
  • Media URLs are kept as references in Site Information (see example below);
  • The resulting file size is as small as possible.
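
The cleaning steps listed above (strip attributes, omit frames, media and scripts, keep media URLs as references, drop empty elements) could be sketched as follows. This is a simplified model rather than the script's actual code: it works on plain objects of the form `{ tag, attrs, children }` instead of the live DOM so that it runs outside a browser, and the names `cleanTree`, `BLOCKED_TAGS` and `KEPT_ATTRIBUTES`, as well as the choice to keep `href` on links, are assumptions.

```javascript
// Simplified model (an assumption for this sketch): an element is
// { tag, attrs, children } and a text node is a plain string.
const BLOCKED_TAGS = new Set([
  'script', 'style', 'iframe', 'frame', 'audio', 'video', 'img', 'object', 'embed',
]);
const VOID_TAGS = new Set(['br', 'hr']);            // childless by design, so kept
const KEPT_ATTRIBUTES = new Map([['a', ['href']]]); // assumption: links stay usable

function cleanNode(node, media) {
  if (typeof node === 'string') return node === '' ? null : node;
  const tag = node.tag.toLowerCase();
  const attrs = node.attrs || {};
  if (BLOCKED_TAGS.has(tag)) {
    // Drop the element, but remember where its resource lived.
    if (attrs.src) media.push({ tag, url: attrs.src });
    return null;
  }
  // Strip every attribute except the few explicitly allowed for this tag.
  const kept = {};
  for (const name of KEPT_ATTRIBUTES.get(tag) || []) {
    if (Object.prototype.hasOwnProperty.call(attrs, name)) kept[name] = attrs[name];
  }
  const children = [];
  for (const child of node.children || []) {
    const cleaned = cleanNode(child, media);
    if (cleaned !== null) children.push(cleaned);
  }
  // Omit elements that end up with no content.
  if (children.length === 0 && !VOID_TAGS.has(tag)) return null;
  return { tag, attrs: kept, children };
}

function cleanTree(root) {
  const media = [];
  const node = cleanNode(root, media);
  return { node, media };
}
```

The `media` list returned by `cleanTree` is the kind of data that would feed the `extracted-media-*` entries of Site Information.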

Work in Progress

  • Send text annotations via Email or Jabber/XMPP.

Comparison

Tested on this page.
The following compares the output size of commonly available ways to save a page (default behaviour):

Software       Complete   Page     Selection
Save HTML      -          14 KiB   4 KiB
Wget           -          24 KiB   -
Falkon         350 KiB    24 KiB   -
Writer (ODT)   -          33 KiB   32 KiB
SingleFile     600 KiB    -        580 KiB
Save Page WE   1.3 MiB    -        -

Motivation

This script was written for the following reasons:

  • Some save-page extensions are not available for the Falkon web browser.
  • The maintained save-page extensions are designed to save large portions of a webpage in order to make an authentic copy; hence unnecessary data (e.g. CSS, fonts, images) is pulled, and the resulting file may range from 500 KB to 5 MB or more.
  • The FocusWriter word processor ignores hyperlinks, so copying and pasting has to be done meticulously, which is time-consuming and not always accurate.
  • LibreOffice takes time to load, so the copy-and-paste task might take between 30 and 60 seconds.
  • ODT files are often larger than an average HTML file of the same content.

Example Site Information

Tag                      Value
url                      https://www.corbettreport.com/5thgen/
date                     Tue May 09 2023 15:29:39 GMT+0200
creator                  i2p.schimon.paperclip
user-agent               Mozilla/5.0 (Wayland; Linux postmarketOS) Falkon/23.04.3
content-type-sourced     text/html
charset-sourced          UTF-8
viewport-imported        width=device-width,initial-scale=1
description-imported     We are in the middle of a world-changing war. . . .
generator-imported       All in One SEO (AIOSEO) 4.3.6.1
extracted-media-audio    https://www.corbettreport.com/mp3/episode441_5th_gen.mp3?_=1
extracted-media-iframe   https://odysee.com/$/embed/@corbettreport:0/ep441-5thgen:0
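
One plausible way to carry such tag/value pairs inside the saved file is as `<meta>` elements in the document head. The sketch below assumes that representation; how the script actually serializes Site Information is not shown on this page, and the function names `escapeAttribute` and `siteInformationToMeta` are hypothetical.

```javascript
// Escape a value so it is safe inside a double-quoted (X)HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// `entries` is a list of [tag, value] pairs rather than an object, because
// tags such as extracted-media-audio may legitimately appear more than once.
function siteInformationToMeta(entries) {
  return entries
    .map(([name, content]) =>
      `<meta name="${escapeAttribute(name)}" content="${escapeAttribute(content)}"/>`)
    .join('\n');
}

// Example with two rows from the table above.
const metaBlock = siteInformationToMeta([
  ['url', 'https://www.corbettreport.com/5thgen/'],
  ['creator', 'i2p.schimon.paperclip'],
]);
```

The self-closing `/>` form is used so the same output is valid in both HTML and XHTML.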

Upcoming changes

  • Editable elements (element.contentEditable = "true");
  • Remove elements;
  • Convert embedded elements (e.g. iframe) to links; (cancelled)
  • Save links of embedded elements (e.g. iframe) to meta tags; (cancelled)
  • Omit elements with no content (e.g. <tag></tag>); (done)
  • Save images; (cancelled; may be relevant to HTMLZ)
  • Multiple modes (Markdown, PDF, Screenshot);
  • Save to HTMLZ;
  • Bookmarklet.
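
For the planned bookmarklet, the usual technique is to wrap the program in an immediately invoked function and percent-encode it into a `javascript:` URL. The helper below is a generic sketch of that technique under those assumptions, not the author's planned implementation; `toBookmarklet` is a hypothetical name.

```javascript
// Turn a piece of JavaScript source into a bookmarklet URL.
function toBookmarklet(source) {
  // The wrapper function keeps the program's variables out of the page's
  // global scope; the newline before the closing brace protects against a
  // trailing "//" comment in `source` swallowing it.
  const wrapped = `(function(){${source}\n})();`;
  // Percent-encode so the whole program is a single valid URL.
  return 'javascript:' + encodeURIComponent(wrapped);
}

const exampleBookmarklet = toBookmarklet('alert("saved");');
```

The resulting string can be stored as the address of an ordinary browser bookmark.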

🦅 Designed for Falkon web browser

📱 Designed for postmarketOS Linux
