nT-Exporter

Export content URLs from a newTumbl blog

Author
ceodoe
Daily installs
0
Total installs
5
Ratings
0 0 0
Version
1.3
Created
02/06/2023
Updated
02/06/2023
Size
4.58 KB
License
GPLv3
Applies to

Important! This script is now obsolete, as newTumbl has closed down permanently. I will leave the script up in case anyone wants to adapt it to other sites or otherwise reuse the code - you are allowed to do so under the GPLv3 license. The original readme is preserved below.

nT-Exporter

(newTumbl Exporter)

newTumbl.com has no way to export or dump your blog, so here's a script to do it for you. This script produces a text file populated with URLs to all the content on the page you are exporting.

nT-Exporter supports dumping the following pages on nT:

  • Your own blog
  • Other people's blogs
  • Any search page
  • Your drafts
  • Your queue
  • Your feed
  • Your dashboard

How to use this script:

  • Install the script
  • You will notice that there's a new option in the navigation bar that says "Export". Don't click it yet!
  • Navigate to the page you want to dump (see above for supported pages)
  • To make it easy to dump a full page, set the preview mode to "Preview, Small". In the top right of the page, click the preview selection icon (looks like three rows of descending blocks), then choose "Preview, Small". This lets you load the most content into the page at a time.
  • In order to dump the URLs, all post blocks you wish to dump need to be loaded in. Press the End button on your keyboard or scroll to the bottom of the page repeatedly, until all the posts you want to dump have loaded. Note that you do not need to wait for the actual image/video to load, just the post blocks containing them, as we only need the URLs to be available to the page.
  • Once all posts/post blocks are available on the page, click the "Export" option in the navigation bar at the top of the page.
  • A text file containing the URLs of all the images and videos on the current page will be downloaded.
  • A quick guide and tally will be shown as an alert.
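The tally shown by the script is not reproduced here. As a rough cross-check of a dump before handing it to a downloader, the sketch below counts entries by file extension. It assumes the dump holds one URL per line (which is what wget -i expects); the function name and the example.invalid URLs are hypothetical and not taken from the script.

```python
from collections import Counter
import os
from urllib.parse import urlparse


def tally_by_extension(lines):
    """Count dump entries per file extension (hypothetical helper, not part of nT-Exporter)."""
    counts = Counter()
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        # Only the path part matters; lower-case so ".JPG" and ".jpg" count together.
        extension = os.path.splitext(urlparse(url).path)[1].lower()
        counts[extension or "(none)"] += 1
    return counts


# Assumed input format: one URL per line.
sample = [
    "https://example.invalid/media/one.jpg",
    "https://example.invalid/media/two.JPG",
    "https://example.invalid/media/clip.mp4",
]
for extension, count in sorted(tally_by_extension(sample).items()):
    print(extension, count)
```

To run it on a real dump, pass an open file instead of the sample list, for example tally_by_extension(open("blogname_nT-Exporter-123456.txt", encoding="utf-8")).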

Additional information

The script ignores all non-post images. That means avatars, page icons, and the like are not included, only images that are part of posts.

The script also ignores duplicates, as long as they are the same file on the server. Two posts that show the same image but point to two different uploads will both appear in the dump. Two posts that point to the same upload (for example, an original post and a reblog of it) produce only one entry.
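The duplicate handling described above can be sketched as follows. This is a minimal illustration that assumes "same file on the server" means an identical URL string. It is not the script's actual code, and the function name and example.invalid URLs are hypothetical.

```python
def unique_urls(urls):
    """Return URLs in first-seen order, dropping exact repeats (hypothetical helper)."""
    seen = set()
    result = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


# An original post and its reblog share one upload; a re-upload has its own URL.
collected = [
    "https://example.invalid/uploads/abc.jpg",  # original post
    "https://example.invalid/uploads/abc.jpg",  # reblog of the same post
    "https://example.invalid/uploads/xyz.jpg",  # same picture, separate upload
]
print(unique_urls(collected))
```

Comparing URL strings says nothing about image content, so a separately uploaded copy of the same picture is still kept.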

This script does not download the content itself, as there is no reliable mechanism for downloading potentially thousands of files at once in your browser. In order to actually download the media, you will need to feed the downloaded text file into an external downloader, such as wget or JDownloader 2.

wget

Example: wget -nc -i blogname_nT-Exporter-123456.txt

Points of note

  • -nc makes sure that you don't download the same file more than once, in case you want to refresh your dump after some time with a new export (content filenames on nT are static)
  • Replace blogname_nT-Exporter-123456.txt with the filename of your dump text file
  • Be careful with wrapping wget in a loop or using parallel for concurrency, as the servers might start blocking you if you pound them with requests. It's better to get the files one by one than not to get them at all.
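If wget is not available, the same one-at-a-time approach can be sketched in Python. This is an illustration under assumptions, not a replacement for wget: it assumes one URL per line in the dump, takes the filename from the last path segment of each URL, and pauses between requests. The function name, the delay value, and the fetch parameter are hypothetical choices.

```python
import os
import time
import urllib.request
from urllib.parse import urlparse


def download_all(list_path, dest_dir=".", delay=1.0, fetch=urllib.request.urlretrieve):
    """Download URLs from list_path sequentially, skipping files that already exist."""
    with open(list_path, encoding="utf-8") as handle:
        urls = [line.strip() for line in handle if line.strip()]

    downloaded = []
    for url in urls:
        name = os.path.basename(urlparse(url).path)
        if not name:
            continue  # URL has no filename component; nothing sensible to save as
        target = os.path.join(dest_dir, name)
        if os.path.exists(target):
            continue  # same idea as wget -nc: never fetch a file twice
        fetch(url, target)  # one request at a time, no concurrency
        downloaded.append(target)
        time.sleep(delay)  # assumed pause so the server is not hammered
    return downloaded
```

A call such as download_all("blogname_nT-Exporter-123456.txt", delay=2.0) would fetch the files one by one into the current directory. The fetch parameter exists only so the loop can be tried with a stand-in function instead of real network requests.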

JDownloader 2

  • Open JDownloader 2
  • Open the text file
  • Copy all content in the text file
  • JDownloader 2 should pick up the copied text automatically and add the links to the LinkGrabber
  • Click "Start all downloads" or "Add all to download list" in order to start downloading the media files

Points of note

  • JDownloader is usually configured to download several files at once from HTTP sources. You might want to change this setting in order not to hammer the servers with requests.